add documentation site
thammegowda committed Dec 27, 2017
1 parent 80e993d commit 1c05cda7753f3e09fb33e62cd337edba2bbadd56
@@ -0,0 +1,3 @@
*.sw?
_site
_pages
@@ -0,0 +1,3 @@
# Sparkler Docs

Read the docs at http://irds.usc.edu/sparkler
@@ -1 +1,57 @@
theme: jekyll-theme-dinky
# Site title and subtitle. This is used in _includes/header.html
title: 'Sparkler'
subtitle: 'Spark Crawler'

# if you wish to integrate disqus on pages set your shortname here
disqus_shortname: 'Sparkler'

# if you use google analytics, add your tracking id here
google_analytics_id: 'UA-77850818-1'

# Enable/show navigation. There are three options:
# 0 - always hide
# 1 - always show
# 2 - show only if posts are present
navigation: 1

# URL to source code, used in _includes/footer.html
codeurl: 'https://github.com/USCDataScience/sparkler'

# Default categories (in order) to appear in the navigation
sections: [
['doc', 'Documentation'],
['tut', 'Tutorial'],
['ref', 'Reference'],
['dev', 'Developers'],
['post', 'Posts']
]

# Keep as an empty string if served up at the root. If served up at a specific
# path (e.g. on GitHub pages) leave off the trailing slash, e.g. /my-project
baseurl: '/sparkler'

# Dates are not included in permalinks
permalink: none

# Syntax highlighting
highlighter: rouge

# Since these are pages, it doesn't really matter
future: true

# Exclude non-site files
exclude: ['bin', 'README.md', 'presentations', 'proposal', 'Sparkler-Dashboard.png']

# Use the kramdown Markdown renderer
markdown: kramdown
redcarpet:
extensions: [
'no_intra_emphasis',
'fenced_code_blocks',
'autolink',
'strikethrough',
'superscript',
'with_toc_data',
'tables',
'hardwrap'
]
@@ -0,0 +1,13 @@
<div id="disqus_thread"></div>
<script type="text/javascript">
/* * * CONFIGURATION VARIABLES: EDIT BEFORE PASTING INTO YOUR WEBPAGE * * */
var disqus_shortname = '{{ site.disqus_shortname }}'; // required: replace example with your forum shortname
/* * * DON'T EDIT BELOW THIS LINE * * */
(function() {
var dsq = document.createElement('script'); dsq.type = 'text/javascript'; dsq.async = true;
dsq.src = '//' + disqus_shortname + '.disqus.com/embed.js';
(document.getElementsByTagName('head')[0] || document.getElementsByTagName('body')[0]).appendChild(dsq);
})();
</script>
<noscript>Please enable JavaScript to view the <a href="https://disqus.com/?ref_noscript">comments powered by Disqus.</a></noscript>
@@ -0,0 +1 @@
Documentation for <a href="{% if site.codeurl %}{{ site.codeurl }}{% else %}{{ site.baseurl }}{% endif %}">{{ site.title }}</a>
@@ -0,0 +1,9 @@
<script>
(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
(i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
})(window,document,'script','//www.google-analytics.com/analytics.js','ga');
ga('create', '{{ site.google_analytics_id }}', 'auto');
ga('send', 'pageview');
</script>
@@ -0,0 +1,3 @@
<h4><a class="brand" href="{{ site.baseurl }}/">{{ site.title }}</a>
{% if site.subtitle %}<small>{{ site.subtitle }}</small>{% endif %}
</h4>
@@ -0,0 +1,16 @@
<ul class="nav nav-list">
<li><a href="{{ site.baseurl }}/">Home</a></li>
{% for section in site.sections %}
{% assign attr = section[0] %}
{% assign label = section[1] %}

{% for page in site.categories[attr] %}
{% if forloop.first %}
<li class="nav-header">{{ label }}</li>
{% endif %}
<li data-order="{{ page.order }}"><a href="{{ site.baseurl }}{{ page.url }}">{{ page.title }}</a></li>
{% endfor %}
{% endfor %}
<!-- List additional links. It is recommended to add a divider
e.g. <li class="divider"></li> first to break up the content. -->
</ul>
@@ -0,0 +1,116 @@
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1">
<meta name="viewport" content="width=device-width">

<title>{{ site.title }}{% if page.title %} : {{ page.title }}{% endif %}</title>
<meta name="description" content="{{ site.subtitle }}">

<link href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.5/css/bootstrap.min.css" rel="stylesheet">
<link rel="stylesheet" href="{{ site.baseurl }}/css/syntax.css">
<link rel="stylesheet" href="{{ site.baseurl }}/css/main.css">
</head>
<body>

<div class="container">
<div class="row">
<div id="header" class="col-sm-12">
{% include header.html %}
</div>
</div>

<div class="row">
{% assign post_count = site.posts|size %}
{% if site.navigation != 0 and site.navigation == 1 or post_count > 0 %}
<div id="navigation" class="col-sm-2">
{% include navigation.html %}
</div>

<div id="content" class="col-sm-10">
{{ content }}
</div>
{% else %}
<div id="content" class="col-sm-12">
{{ content }}
</div>
{% endif %}
</div>

{% if page.disqus == 1 %}
<div class="row">
{% if site.navigation != 0 and site.navigation == 1 or post_count > 0 %}
<div id="navigation" class="col-sm-2"></div>
<div id="disqus" class="col-sm-10">
{% include disqus.html %}
</div>
{% else %}
<div id="disqus" class="col-sm-12">
{% include disqus.html %}
</div>
{% endif %}
</div>
{% endif %}

<div class="row">
<div id="footer" class="col-sm-12">
{% include footer.html %}
</div>
</div>
</div>

<script>
function orderNav() {
var list,
section,
header,
sections = [],
lists = {},
headers = {};
var navUl = document.querySelectorAll('#navigation ul')[0],
navLis = document.querySelectorAll('#navigation ul li');
if (!navUl) return;
for (var i = 0; i < navLis.length; i++) {
var order, li = navLis[i];
if (li.classList.contains('nav-header')) {
section = li.textContent || li.innerText;
sections.push(section);
headers[section] = li;
continue;
}
if (!lists[section]) {
lists[section] = [];
}
order = parseFloat(li.getAttribute('data-order'))
lists[section].push([order, li]);
}
for (var i = 0; i < sections.length; i++) {
section = sections[i];
list = lists[section].sort(function(a, b) {
return a[0] - b[0];
});
if (header = headers[section]) {
navUl.appendChild(header);
}
for (var j = 0; j < list.length; j++) {
navUl.appendChild(list[j][1]);
}
}
}
if (document.querySelectorAll) orderNav();
</script>
{% if site.google_analytics_id != "" %}
{% include google_analytics.html %}
{% endif %}
</body>
</html>
@@ -0,0 +1,11 @@
---
layout: default
---

<div class="page-header">
<h2>{{ page.title }}
{% if page.subtitle %}<small>{{ page.subtitle }}</small>{% endif %}
</h2>
</div>

{{ content }}
No changes.
@@ -0,0 +1,48 @@
---
layout: page
title: "Contributing to Docs"
category: dev
date: 2017-12-26 15:27:29
---

Contributions of any size are welcome, including new tutorials and how-tos.

This page explains how to update the documentation.

To add a new page to this website, run:

```bash
ruby bin/jekyll-page "Page Title" <category>
```

`<category>` can be:

- `doc` - Documentation
- `tut` - Tutorial
- `ref` - Reference
- `dev` - Developers
- `post` - Posts

For example, to write a tutorial about **Crawling images using Sparkler**, run:

```bash
ruby bin/jekyll-page "Crawling Images using Sparkler" tut
```

Then edit the generated Markdown file under the `_posts/` directory.
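
To give a rough idea of what such a file might contain, the sketch below writes one by hand. The front-matter fields mirror the ones used by the pages of this site (`layout`, `title`, `category`, `date`), and the `YYYY-MM-DD-title.md` file name follows the usual Jekyll convention for `_posts/`. The `docs_dir` variable, the slug, and the body text are assumptions made for illustration; the file that `bin/jekyll-page` generates may differ.

```bash
# Sketch only: hand-writes a post similar to what the generator is expected to create.
# docs_dir, slug and the body text are illustrative assumptions.
docs_dir="${docs_dir:-$(mktemp -d)}"      # scratch directory unless docs_dir is set
title="Crawling Images using Sparkler"
category="tut"                            # one of: doc, tut, ref, dev, post
slug="crawling-images-using-sparkler"
post_file="$docs_dir/_posts/$(date +%Y-%m-%d)-$slug.md"

mkdir -p "$docs_dir/_posts"

# Front matter first, then the Markdown body.
cat > "$post_file" <<EOF
---
layout: page
title: "$title"
category: $category
date: $(date '+%Y-%m-%d %H:%M:%S')
---

Write the tutorial content here in Markdown.
EOF

echo "Created $post_file"
```

The `category` value decides which navigation section the page is listed under.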

Then follow the standard GitHub contribution workflow.
If you have not already done so, fork this project from [https://github.com/USCDataScience/sparkler](https://github.com/USCDataScience/sparkler) to `https://github.com/<yourId>/sparkler`.

```bash
git remote add own git@github.com:<yourId>/sparkler.git
git add docs/_posts/*
git commit -m 'Added documentation for ___'
git push own <branchname>
```

Then raise a pull request at [https://github.com/USCDataScience/sparkler](https://github.com/USCDataScience/sparkler) using the GitHub web UI.

Contact the developers on [Slack](/sparkler/#slack) if you have questions.
@@ -0,0 +1,97 @@
---
layout: page
title: "Dev Environment Setup"
category: dev
date: 2017-12-26 14:49:45
---

## Requirements for developing with Docker
- JDK 8 - Install it from [https://java.com/en/download/](https://java.com/en/download/)
- Docker - Install it from [https://docs.docker.com/engine/installation/](https://docs.docker.com/engine/installation/)
- Maven - Install it from [https://maven.apache.org/download.cgi](https://maven.apache.org/download.cgi)
- An IDE - Get [IntelliJ IDEA Community Edition from here](https://www.jetbrains.com/idea/download/)

Docker is a shortcut for quickly launching Solr and the admin dashboard from a prebuilt image.
If you prefer to install Solr natively, skip Docker and install Solr from [http://archive.apache.org/dist/lucene/solr/7.1.0/](http://archive.apache.org/dist/lucene/solr/7.1.0/)

## Launch Solr and Banana Dashboard

#### Using Docker:

```bash
docker run -p 8983:8983 --user sparkler -it uscdatascience/sparkler
```

#### Using Solr natively:

Follow the instructions highlighted at [this URL](https://github.com/USCDataScience/sparkler/blob/19bff47c669b683c860ff833a00f36a5b8b63686/sparkler-deployment/docker/Dockerfile#L52-L66)

In the default setting, Sparkler tries to connect to Solr at [http://localhost:8983/solr](http://localhost:8983/solr) and the dashboard at [http://localhost:8983/banana](http://localhost:8983/banana)
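
Before building, it can help to confirm that both endpoints respond. The following is a minimal sketch, assuming `curl` is installed and that Solr listens on port 8983, the port published by the Docker command above. The `SOLR_HOST`, `SOLR_PORT`, and `check_url` names are made up for illustration; adjust them if your setup differs.

```bash
# Sketch only: reports whether the Solr and Banana URLs respond.
# SOLR_HOST, SOLR_PORT and check_url are illustrative names.
SOLR_HOST="${SOLR_HOST:-localhost}"
SOLR_PORT="${SOLR_PORT:-8983}"
base_url="http://${SOLR_HOST}:${SOLR_PORT}"

check_url() {
    # -f: treat HTTP errors as failures, -s: silent, --max-time: do not hang
    if curl -fs -o /dev/null --max-time 5 "$1"; then
        echo "reachable:   $1"
    else
        echo "unreachable: $1"
    fi
}

check_url "$base_url/solr/"
check_url "$base_url/banana/"
```

If either URL is reported as unreachable, check that the container (or the native Solr instance) is running and that the port mapping matches.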


## Building the Project

#### Obtaining the Source code for the first time

```bash
git clone git@github.com:USCDataScience/sparkler.git
cd sparkler
```

Open a terminal and `cd` to the root of the project.

#### Building the whole project:

```bash
git pull origin master
mvn clean package
```

The whole project includes the API, App, and Plugins. This build also runs all the test cases.


To build the core project excluding plugins:

```bash
mvn clean package -Pcore
```

When the build succeeds, it creates a `build` directory with the following structure:

```
build/
├── bin/                 -- Useful scripts
│   └── sparkler.sh      -- Command line interface
├── conf/                -- All the config files
├── plugins/             -- All the plugin jars
└── sparkler-app*.jar    -- Application code except plugins
```
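
One way to confirm this layout after a build is sketched below. It only checks for the entries listed in the tree; the `check_build` function name and its messages are made up for illustration and are not part of Sparkler.

```bash
# Sketch only: verifies that a build directory contains the entries listed above.
# check_build is an illustrative helper, not a Sparkler command.
check_build() {
    build_dir="$1"
    missing=0
    for entry in bin/sparkler.sh conf plugins; do
        if [ ! -e "$build_dir/$entry" ]; then
            echo "missing: $build_dir/$entry"
            missing=1
        fi
    done
    # The application jar name carries a version, so match it with a glob.
    set -- "$build_dir"/sparkler-app*.jar
    if [ ! -e "$1" ]; then
        echo "missing: $build_dir/sparkler-app*.jar"
        missing=1
    fi
    return "$missing"
}

# Only run the check when a build directory is present.
if [ -d build ]; then
    check_build build && echo "build directory looks complete"
fi
```

The function returns a non-zero status when anything is missing, so it can also be used as a guard before running a crawl.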

## Running a test crawl

```bash
build/bin/sparkler.sh inject -id j1 -su http://<yoursite>.com
build/bin/sparkler.sh crawl -id j1
```
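
The two steps share a job id, so it can be convenient to define it once. The sketch below wraps the same `inject` and `crawl` commands in a small script. The `JOB_ID`, `SEED_URL`, and `SPARKLER` variable names and the `crawl_commands` helper are assumptions for illustration, and `http://example.com` is only a placeholder seed URL.

```bash
# Sketch only: runs the inject and crawl steps with a shared job id.
# JOB_ID, SEED_URL, SPARKLER and crawl_commands are illustrative names.
JOB_ID="${JOB_ID:-j1}"
SEED_URL="${SEED_URL:-http://example.com}"       # placeholder seed URL
SPARKLER="${SPARKLER:-build/bin/sparkler.sh}"

crawl_commands() {
    # Print the two commands in the order they must run.
    echo "$SPARKLER inject -id $JOB_ID -su $SEED_URL"
    echo "$SPARKLER crawl -id $JOB_ID"
}

if [ -x "$SPARKLER" ]; then
    # Stop at the first failure so a failed inject does not trigger a crawl.
    "$SPARKLER" inject -id "$JOB_ID" -su "$SEED_URL" &&
    "$SPARKLER" crawl -id "$JOB_ID"
else
    echo "sparkler.sh not found; after building, run:"
    crawl_commands
fi
```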

If the above commands don't make sense, watch the video below.


## Making changes to code

Import the project into your IDE and edit the source code.
Then follow the build instructions above.


## Adding a new plugin

<iframe width="600" height="360" src="http://www.youtube.com/embed/Ib8OwmoRj-Q" frameborder="0" allowfullscreen="allowfullscreen"></iframe>


## [Contributing to Project Source Code](#contributing-source)

If you fix bugs or add new features, please raise a pull request on GitHub.
Learn more about GitHub pull requests [here](https://blog.scottlowe.org/2015/01/27/using-fork-branch-git-workflow/).

Contact the developers on [Slack](/sparkler/#slack) if you have questions.
