Showing 18 changed files with 674 additions and 1 deletion.
- +3 −0 docs/.gitignore
- +3 −0 docs/README.md
- +57 −1 docs/_config.yml
- +13 −0 docs/_includes/disqus.html
- +1 −0 docs/_includes/footer.html
- +9 −0 docs/_includes/google_analytics.html
- +3 −0 docs/_includes/header.html
- +16 −0 docs/_includes/navigation.html
- +116 −0 docs/_layouts/default.html
- +11 −0 docs/_layouts/page.html
- 0 docs/_posts/.gitkeep
- +48 −0 docs/_posts/2017-12-26-contributing-to-docs.md
- +97 −0 docs/_posts/2017-12-26-development-environment-setup.md
- +109 −0 docs/bin/jekyll-page
- 0 docs/changelog.md
- +94 −0 docs/css/main.css
- +61 −0 docs/css/syntax.css
- +33 −0 docs/index.md
@@ -0,0 +1,3 @@ | ||
*.sw? | ||
_site | ||
_pages |
@@ -0,0 +1,3 @@ | ||
# Sparkler Docs | ||
|
||
Read the docs at http://irds.usc.edu/sparkler |
@@ -1 +1,57 @@ | ||
theme: jekyll-theme-dinky | ||
# Site title and subtitle. This is used in _includes/header.html | ||
title: 'Sparkler' | ||
subtitle: 'Spark Crawler' | ||
|
||
# if you wish to integrate disqus on pages set your shortname here | ||
disqus_shortname: 'Sparkler' | ||
|
||
# if you use google analytics, add your tracking id here | ||
google_analytics_id: 'UA-77850818-1' | ||
|
||
# Enable/show navigation. There are three options:
# 0 - always hide | ||
# 1 - always show | ||
# 2 - show only if posts are present | ||
navigation: 1 | ||
|
||
# URL to source code, used in _includes/footer.html | ||
codeurl: 'https://github.com/USCDataScience/sparkler' | ||
|
||
# Default categories (in order) to appear in the navigation | ||
sections: [ | ||
['doc', 'Documentation'], | ||
['tut', 'Tutorial'], | ||
['ref', 'Reference'], | ||
['dev', 'Developers'], | ||
['post', 'Posts'] | ||
] | ||
|
||
# Keep as an empty string if served up at the root. If served up at a specific | ||
# path (e.g. on GitHub pages) leave off the trailing slash, e.g. /my-project | ||
baseurl: '/sparkler' | ||
|
||
# Dates are not included in permalinks | ||
permalink: none | ||
|
||
# Syntax highlighting | ||
highlighter: rouge | ||
|
||
# Since these are pages, it doesn't really matter | ||
future: true | ||
|
||
# Exclude non-site files | ||
exclude: ['bin', 'README.md', 'presentations', 'proposal', 'Sparkler-Dashboard.png'] | ||
|
||
# Use the kramdown Markdown renderer | ||
markdown: kramdown | ||
redcarpet: | ||
extensions: [ | ||
'no_intra_emphasis', | ||
'fenced_code_blocks', | ||
'autolink', | ||
'strikethrough', | ||
'superscript', | ||
'with_toc_data', | ||
'tables', | ||
'hardwrap' | ||
] |
@@ -0,0 +1,13 @@ | ||
<div id="disqus_thread"></div> | ||
<script type="text/javascript"> | ||
/* * * CONFIGURATION VARIABLES: EDIT BEFORE PASTING INTO YOUR WEBPAGE * * */ | ||
var disqus_shortname = '{{ site.disqus_shortname }}'; // required: replace example with your forum shortname | ||
/* * * DON'T EDIT BELOW THIS LINE * * */ | ||
(function() { | ||
var dsq = document.createElement('script'); dsq.type = 'text/javascript'; dsq.async = true; | ||
dsq.src = '//' + disqus_shortname + '.disqus.com/embed.js'; | ||
(document.getElementsByTagName('head')[0] || document.getElementsByTagName('body')[0]).appendChild(dsq); | ||
})(); | ||
</script> | ||
<noscript>Please enable JavaScript to view the <a href="https://disqus.com/?ref_noscript">comments powered by Disqus.</a></noscript> |
@@ -0,0 +1 @@ | ||
Documentation for <a href="{% if site.codeurl %}{{ site.codeurl }}{% else %}{{ site.baseurl }}{% endif %}">{{ site.title }}</a> |
@@ -0,0 +1,9 @@ | ||
<script> | ||
(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){ | ||
(i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o), | ||
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m) | ||
})(window,document,'script','//www.google-analytics.com/analytics.js','ga'); | ||
ga('create', '{{ site.google_analytics_id }}', 'auto'); | ||
ga('send', 'pageview'); | ||
</script> |
@@ -0,0 +1,3 @@ | ||
<h4><a class="brand" href="{{ site.baseurl }}/">{{ site.title }}</a> | ||
{% if site.subtitle %}<small>{{ site.subtitle }}</small>{% endif %} | ||
</h4> |
@@ -0,0 +1,16 @@ | ||
<ul class="nav nav-list"> | ||
<li><a href="{{ site.baseurl }}/">Home</a></li> | ||
{% for section in site.sections %} | ||
{% assign attr = section[0] %} | ||
{% assign label = section[1] %} | ||
|
||
{% for page in site.categories[attr] %} | ||
{% if forloop.first %} | ||
<li class="nav-header">{{ label }}</li> | ||
{% endif %} | ||
<li data-order="{{ page.order }}"><a href="{{ site.baseurl }}{{ page.url }}">{{ page.title }}</a></li> | ||
{% endfor %} | ||
{% endfor %} | ||
<!-- List additional links. It is recommended to add a divider | ||
e.g. <li class="divider"></li> first to break up the content. --> | ||
</ul> |
@@ -0,0 +1,116 @@ | ||
<!DOCTYPE html> | ||
<html> | ||
<head> | ||
<meta charset="utf-8"> | ||
<meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1"> | ||
<meta name="viewport" content="width=device-width"> | ||
|
||
<title>{{ site.title }}{% if page.title %} : {{ page.title }}{% endif %}</title> | ||
<meta name="description" content="{{ site.subtitle }}"> | ||
|
||
<link href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.5/css/bootstrap.min.css" rel="stylesheet"> | ||
<link rel="stylesheet" href="{{ site.baseurl }}/css/syntax.css"> | ||
<link rel="stylesheet" href="{{ site.baseurl }}/css/main.css"> | ||
</head> | ||
<body> | ||
|
||
<div class="container"> | ||
<div class="row"> | ||
<div id="header" class="col-sm-12"> | ||
{% include header.html %} | ||
</div> | ||
</div> | ||
|
||
<div class="row"> | ||
{% assign post_count = site.posts|size %} | ||
{% if site.navigation != 0 and site.navigation == 1 or post_count > 0 %} | ||
<div id="navigation" class="col-sm-2"> | ||
{% include navigation.html %} | ||
</div> | ||
|
||
<div id="content" class="col-sm-10"> | ||
{{ content }} | ||
</div> | ||
{% else %} | ||
<div id="content" class="col-sm-12"> | ||
{{ content }} | ||
</div> | ||
{% endif %} | ||
</div> | ||
|
||
{% if page.disqus == 1 %} | ||
<div class="row"> | ||
{% if site.navigation != 0 and site.navigation == 1 or post_count > 0 %}
<div id="navigation" class="col-sm-2"></div> | ||
<div id="disqus" class="col-sm-10"> | ||
{% include disqus.html %} | ||
</div> | ||
{% else %} | ||
<div id="disqus" class="col-sm-12"> | ||
{% include disqus.html %} | ||
</div> | ||
{% endif %} | ||
</div> | ||
{% endif %} | ||
|
||
<div class="row"> | ||
<div id="footer" class="col-sm-12"> | ||
{% include footer.html %} | ||
</div> | ||
</div> | ||
</div> | ||
|
||
<script> | ||
function orderNav() { | ||
var list, | ||
section, | ||
header, | ||
sections = [], | ||
lists = {}, | ||
headers = {}; | ||
var navUl = document.querySelectorAll('#navigation ul')[0], | ||
navLis = document.querySelectorAll('#navigation ul li'); | ||
if (!navUl) return; | ||
for (var i = 0; i < navLis.length; i++) { | ||
var order, li = navLis[i]; | ||
if (li.classList.contains('nav-header')) { | ||
section = li.textContent || li.innerText; | ||
sections.push(section); | ||
headers[section] = li; | ||
continue; | ||
} | ||
if (!lists[section]) { | ||
lists[section] = []; | ||
} | ||
order = parseFloat(li.getAttribute('data-order')) | ||
lists[section].push([order, li]); | ||
} | ||
for (var i = 0; i < sections.length; i++) { | ||
section = sections[i]; | ||
list = lists[section].sort(function(a, b) { | ||
return a[0] - b[0]; | ||
}); | ||
if (header = headers[section]) { | ||
navUl.appendChild(header); | ||
} | ||
for (var j = 0; j < list.length; j++) { | ||
navUl.appendChild(list[j][1]); | ||
} | ||
} | ||
} | ||
if (document.querySelectorAll) orderNav(); | ||
</script> | ||
{% if site.google_analytics_id != "" %} | ||
{% include google_analytics.html %} | ||
{% endif %} | ||
</body> | ||
</html> |
@@ -0,0 +1,11 @@ | ||
--- | ||
layout: default | ||
--- | ||
|
||
<div class="page-header"> | ||
<h2>{{ page.title }} | ||
{% if page.subtitle %}<small>{{ page.subtitle }}</small>{% endif %} | ||
</h2> | ||
</div> | ||
|
||
{{ content }} |
No changes.
@@ -0,0 +1,48 @@ | ||
--- | ||
layout: page | ||
title: "Contributing to Docs" | ||
category: dev | ||
date: 2017-12-26 15:27:29 | ||
--- | ||
|
||
Contributions of any size are welcome, including tutorials and how-tos!
|
||
This page explains how to update the documentation.
|
||
To add a new page to this website, run:
|
||
```bash | ||
ruby bin/jekyll-page "Page Title" <category> | ||
``` | ||
|
||
`<category>` can be: | ||
|
||
- `doc` - Documentation | ||
- `tut` - Tutorial | ||
- `ref` - Reference | ||
- `dev` - Developers | ||
- `post` - Posts | ||
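
The category codes above can be checked before calling the page generator. This is a minimal sketch, assuming only the five codes listed here are accepted; `valid_category` and `check_category` are hypothetical helpers and are not part of `bin/jekyll-page`:

```bash
# Hypothetical helper: succeed only for the category codes listed above.
valid_category() {
  case "$1" in
    doc|tut|ref|dev|post) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper: print a short verdict for a given category code.
check_category() {
  if valid_category "$1"; then
    echo "ok: $1"
  else
    echo "unknown category: $1"
  fi
}

check_category tut
check_category blog
```

A guard like this could sit in front of the `ruby bin/jekyll-page` call so that a mistyped category is caught before a page is generated.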
|
||
For example, if you want to write a tutorial about **Crawling images using Sparkler**:
|
||
|
||
|
||
```bash | ||
ruby bin/jekyll-page "Crawling Images using Sparkler" tut | ||
``` | ||
|
||
Then edit the generated Markdown file under the `_posts/` directory.
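
To find the file to edit, it can help to predict its name. This is a minimal sketch, assuming the generator names posts `YYYY-MM-DD-<slug>.md`, as the existing files in this commit suggest (for example `2017-12-26-contributing-to-docs.md`). `slugify` and `post_path` are hypothetical helpers, and the real script may build slugs differently:

```bash
# Hypothetical helper: lowercase a title and join its words with hyphens.
slugify() {
  printf '%s' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -e 's/[^a-z0-9]\{1,\}/-/g' -e 's/^-//' -e 's/-$//'
}

# Hypothetical helper: guess the post path from a title and a date.
post_path() {
  printf '_posts/%s-%s.md\n' "$2" "$(slugify "$1")"
}

post_path "Contributing to Docs" 2017-12-26
```

If the guessed path does not exist, listing `_posts/` sorted by modification time is a simpler fallback.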
|
||
Then follow the standard GitHub contribution workflow.
If you have not already done so, fork this project from [https://github.com/USCDataScience/sparkler](https://github.com/USCDataScience/sparkler) to `https://github.com/<yourId>/sparkler`.
|
||
```bash
# Add your fork as a remote named "own" (SSH URLs need a colon after the host)
git remote add own git@github.com:<yourId>/sparkler.git
git add docs/_posts/*
git commit -m 'Added documentation for ___'
git push own <branchname>
```
|
||
Then raise a pull request at [https://github.com/USCDataScience/sparkler](https://github.com/USCDataScience/sparkler) using the GitHub web UI.
|
||
Contact developers on [slack](/sparkler/#slack) if you have questions. |
@@ -0,0 +1,97 @@ | ||
--- | ||
layout: page | ||
title: "Dev Environment Setup" | ||
category: dev | ||
date: 2017-12-26 14:49:45 | ||
--- | ||
|
||
## Requirements for developing with Docker
- JDK 8 - Install it from [https://java.com/en/download/](https://java.com/en/download/) | ||
- Docker - Install it from [https://docs.docker.com/engine/installation/](https://docs.docker.com/engine/installation/) | ||
- Maven - Install it from [https://maven.apache.org/download.cgi](https://maven.apache.org/download.cgi) | ||
- An IDE - Get [Intellij IDEA Community Edition from here](https://www.jetbrains.com/idea/download/) | ||
|
||
Docker is a shortcut for quickly launching Solr and the admin dashboard from a prebuilt image.
If you prefer to install Solr natively, skip Docker and install Solr from [http://archive.apache.org/dist/lucene/solr/7.1.0/](http://archive.apache.org/dist/lucene/solr/7.1.0/)
|
||
## Launch Solr and Banana Dashboard | ||
|
||
#### Using Docker: | ||
|
||
```bash | ||
docker run -p 8983:8983 --user sparkler -it uscdatascience/sparkler | ||
``` | ||
|
||
#### Using Solr natively: | ||
|
||
Follow instructions highlighted by [this URL](https://github.com/USCDataScience/sparkler/blob/19bff47c669b683c860ff833a00f36a5b8b63686/sparkler-deployment/docker/Dockerfile#L52-L66) | ||
|
||
By default, Sparkler tries to connect to Solr at [http://localhost:8983/solr](http://localhost:8983/solr) and the dashboard at [http://localhost:8983/banana](http://localhost:8983/banana)
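
Before starting a crawl, it can be useful to confirm that Solr answers. This is a minimal sketch, assuming the port published by the `docker run` command above (8983) and that `curl` is installed. `solr_url` and `solr_is_up` are hypothetical helpers:

```bash
# Hypothetical helper: build the Solr base URL from an optional host and port.
solr_url() {
  printf 'http://%s:%s/solr\n' "${1:-localhost}" "${2:-8983}"
}

# Hypothetical helper: return 0 if the URL answers without an HTTP error
# (status below 400) within 5 seconds.
solr_is_up() {
  curl --silent --fail --max-time 5 --output /dev/null "$(solr_url "$@")/"
}

solr_url
```

For example, `solr_is_up && echo "Solr is reachable"` prints the message only when the request succeeds.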
|
||
|
||
## Building the Project | ||
|
||
#### Obtaining the Source code for the first time | ||
|
||
```bash | ||
git clone git@github.com:USCDataScience/sparkler.git | ||
cd sparkler | ||
``` | ||
|
||
Launch a terminal and `cd` to the root of the project.
|
||
#### Building whole project: | ||
|
||
```bash
git pull origin master
mvn clean package
```
|
||
This builds the whole project (API, App, and Plugins) and runs all the test cases.
|
||
|
||
To build the core project excluding plugins: | ||
```bash
mvn clean package -Pcore
```
|
||
When the build succeeds, it creates a `build` directory with the following structure:
|
||
```
build/
├── bin/                 -- Useful scripts
│   └── sparkler.sh      -- Command line interface
├── conf/                -- All the config files
├── plugins/             -- All the plugin jars
└── sparkler-app*.jar    -- Application code except plugins
```
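
The layout above can be checked from a script after a build. This is a minimal sketch, assuming the directory and file names shown in the tree; `check_build` is a hypothetical helper and not part of the project:

```bash
# Hypothetical helper: verify that a build directory has the layout shown above.
check_build() {
  local dir="${1:-build}" item jar
  for item in bin/sparkler.sh conf plugins; do
    if [ ! -e "$dir/$item" ]; then
      echo "missing: $dir/$item"
      return 1
    fi
  done
  # The application jar name carries a version, so match it with a glob.
  for jar in "$dir"/sparkler-app*.jar; do
    if [ -e "$jar" ]; then
      echo "build looks complete: $dir"
      return 0
    fi
  done
  echo "missing: $dir/sparkler-app*.jar"
  return 1
}
```

Running `check_build build` from the project root reports the first missing entry, if any.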
|
||
## Running a test crawl | ||
|
||
```bash | ||
build/bin/sparkler.sh inject -id j1 -su http://<yoursite>.com | ||
build/bin/sparkler.sh crawl -id j1 | ||
``` | ||
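
The two steps share a job id, so they can be generated together. This is a minimal sketch that only prints the commands for review, using just the flags shown above (`-id`, `-su`); `crawl_commands` is a hypothetical helper and `http://example.com` is a placeholder seed URL:

```bash
# Hypothetical helper: print the inject and crawl commands for one job
# without running them.
crawl_commands() {
  local job_id="$1" seed_url="$2"
  printf 'build/bin/sparkler.sh inject -id %s -su %s\n' "$job_id" "$seed_url"
  printf 'build/bin/sparkler.sh crawl -id %s\n' "$job_id"
}

crawl_commands j1 http://example.com
```

Once the printed commands look right, they can be run one by one from the project root.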
|
||
If the above commands don't make sense, watch the video below.
|
||
|
||
## Making changes to code | ||
|
||
Import the project to your IDE and edit the source code. | ||
Then follow the build instructions above.
|
||
|
||
## Adding a new plugin | ||
|
||
<iframe width="600" height="360" src="https://www.youtube.com/embed/Ib8OwmoRj-Q" frameborder="0" allowfullscreen="allowfullscreen"></iframe>
|
||
|
||
## [Contributing to Project Source Code](#contributing-source) | ||
|
||
If you fix bugs or add new features, please raise a pull request on GitHub.
Learn more about GitHub pull requests [here](https://blog.scottlowe.org/2015/01/27/using-fork-branch-git-workflow/).
|
||
Contact developers on [slack](/sparkler/#slack) if you have questions. | ||
|
0 comments on commit
1c05cda