Get Dataverse running on OpenShift (Docker and Kubernetes) #4040
Comments
pdurbin added the Feature: Installer and User Role: Sysadmin labels on Aug 3, 2017
I mentioned this issue to @bjonnh yesterday in chat because I've tagged him as the primary contact for the dev effort by the community to work on Docker support in #3938. He said that free OpenShift accounts would be helpful for him, @donsizemore, and anyone else who wants to help with the effort to get Dataverse running on OpenShift.
You can get a free account with the starter tier: https://www.openshift.com/pricing/index.html
That gets you 1 GB of free memory. You can also use https://github.com/openshift/origin/blob/master/docs/cluster_up_down.md or minishift (https://www.openshift.org/minishift/) to run OpenShift on your laptop.
Rules for writing good images: https://docs.openshift.org/latest/creating_images/guidelines.html
How to set the memory based on the cgroup: there is an example in here: https://blog.openshift.com/managing-compute-resources-openshiftkubernetes/ (look under "Writing Applications").
And here is an example from mysql: https://github.com/sclorg/mysql-container/blob/master/5.5/root/usr/bin/cgroup-limits#L48
Our postgresql image: https://hub.docker.com/r/openshift/postgresql-92-centos7/
Example templates: https://github.com/openshift/origin/tree/master/examples
hello-openshift is a fine place to start, then move on to sample-app.
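To make the "set the memory based on the cgroup" advice above concrete, here is a minimal Python sketch. It is not taken from the linked blog post or the mysql script. It assumes a cgroup v1 layout (the file path below), and the 50% heap fraction, the 512 MB fallback, and all function names are illustrative choices rather than anything Dataverse or OpenShift prescribes.

```python
from pathlib import Path
from typing import Optional

# Assumption: cgroup v1 layout. cgroup v2 exposes /sys/fs/cgroup/memory.max instead.
CGROUP_V1_LIMIT = Path("/sys/fs/cgroup/memory/memory.limit_in_bytes")


def parse_limit(raw: str) -> Optional[int]:
    """Parse a cgroup memory limit; return None when no usable limit is set."""
    text = raw.strip()
    if not text.isdigit():  # e.g. cgroup v2 writes "max" when unlimited
        return None
    limit = int(text)
    # cgroup v1 reports a huge value (close to 2**63) when the container is unlimited.
    if limit >= 2**60:
        return None
    return limit


def heap_mb(limit_bytes: Optional[int], fraction: float = 0.5, default_mb: int = 512) -> int:
    """Pick a JVM heap size in MB as a fraction of the container memory limit."""
    if limit_bytes is None:
        return default_mb
    return max(1, int(limit_bytes * fraction) // (1024 * 1024))


def read_limit(path: Path = CGROUP_V1_LIMIT) -> Optional[int]:
    """Read the limit from the cgroup filesystem; None if the file is unavailable."""
    try:
        return parse_limit(path.read_text())
    except OSError:
        return None


if __name__ == "__main__":
    # Prints a JVM flag such as -Xmx512m that an entrypoint script could pass along.
    print(f"-Xmx{heap_mb(read_limit())}m")
```

An entrypoint script could call something like this before starting the application server, so that the heap follows whatever memory limit the pod was given instead of a hard-coded value.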
@danmcp thanks! @portante @danmcp @landreev @scolapasta and I had a great meeting today. Here's a picture of the whiteboard:

@danmcp already did his to-do list items, and my to-do list item is to take the latest images from https://hub.docker.com/r/ndslabs/dataverse/ and reference them in a new file at

Basically, we'll be trying to see what breaks when we try to deploy the NDS Labs images "as is" to OpenShift. From the whiteboard, we'll need to dig into these questions about the NDS images in order to make sure they run on OpenShift:
We are deferring the following concerns until the future:
Basically, the definition of done for this issue is that someone interested in kicking the tires on Dataverse for non-production use will be able to spin it up on the free 1 GB OpenShift "starter" plan. The whiteboard drawing offers some clues on what the pull request might look like. In our

@danmcp I swung by @djbrooke's office and we'd like to figure out when a good time to put this into a sprint would be. The first available would start next Wednesday, Sep 13 and go for two weeks. Let's not pick a time when you're on vacation!
@pdurbin Many of the examples are in json. The templates can be either. I should be around most of the time over the next few weeks.
djbrooke added the Status: Ready and Status: This/Next Sprint labels and removed the Status: Ready label on Sep 8, 2017
pdurbin added the Status: Development label and removed the Status: This/Next Sprint label on Sep 13, 2017
pdurbin self-assigned this on Sep 13, 2017
pdurbin removed the User Role: Sysadmin label on Sep 13, 2017
added a commit that referenced this issue on Sep 14, 2017
@danmcp awesome. Today I signed up for an OpenShift account and went through https://docs.openshift.com/online/getting_started/index.html . That doc is slightly out of date and I got some weird errors along the way (I grabbed a screenshot if you want it) but eventually they resolved themselves and I could see at http://nodejs-mongo-persistent-pdurbin-example.1d35.starter-us-east-1.openshiftapps.com the simple change I made at pdurbin/nodejs-ex@c6efab9 . Great.

I looked at https://github.com/openshift/origin/blob/v3.7.0-alpha.1/examples/hello-openshift/hello-project.json and noticed that there were no containers in there, so I added a containers array under "spec" and included some images from NDS Labs. I'm currently blocked on the error "cannot create projects at the cluster scope" and left a note about this in d287772 which is the first commit of a new
pdurbin referenced this issue on Sep 14, 2017 in "what to do to update to the latest dataverse 4.6.2" #8 (Open)
@pdurbin In your case, you're not going to want to create a project but rather import into an existing project. You should have a kind of template like this one:
djbrooke assigned dlmurphy on Sep 15, 2017
added a commit that referenced this issue on Sep 15, 2017
@danmcp thanks, in 77b3f67 I switched from "Project" to "Template" and stubbed out in the Dataverse dev guide how to use Minishift, which I just installed and have been playing with (with some guidance from @pameyer). I was able to expose a route but I'm not sure how to expose the Docker image at https://hub.docker.com/r/ndslabs/dataverse/ within my installation of Minishift. Any advice?
@pdurbin You will just reference ndslabs/dataverse from an imagestream like this:

Then from your container you would reference the imagestream like this:

with the name of the image stream you picked.
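The snippets from the comment above did not survive in this copy of the thread, so the following is only a rough Python sketch of the general idea, not the original JSON. It builds an ImageStream that points at ndslabs/dataverse and a DeploymentConfig whose ImageChange trigger refers to that stream by name. The stream name "dataverse", the container port 8080, and the "v1" apiVersion (typical of OpenShift 3.x; newer clusters use group-qualified versions) are assumptions.

```python
import json


def image_stream(name, docker_repo):
    """An ImageStream that tracks an external Docker repository."""
    return {
        "kind": "ImageStream",
        "apiVersion": "v1",  # assumption: OpenShift 3.x style API version
        "metadata": {"name": name},
        "spec": {"dockerImageRepository": docker_repo},
    }


def deployment_config(name, stream, tag="latest"):
    """A DeploymentConfig whose container image is resolved from an image stream tag."""
    return {
        "kind": "DeploymentConfig",
        "apiVersion": "v1",
        "metadata": {"name": name},
        "spec": {
            "replicas": 1,
            "selector": {"name": name},
            "template": {
                "metadata": {"labels": {"name": name}},
                "spec": {
                    "containers": [{
                        "name": name,
                        # Overwritten by the ImageChange trigger below once the stream resolves.
                        "image": stream,
                        # Assumption: the application listens on 8080.
                        "ports": [{"containerPort": 8080, "protocol": "TCP"}],
                    }]
                },
            },
            "triggers": [
                {"type": "ConfigChange"},
                {
                    "type": "ImageChange",
                    "imageChangeParams": {
                        "automatic": True,
                        "containerNames": [name],
                        "from": {"kind": "ImageStreamTag", "name": f"{stream}:{tag}"},
                    },
                },
            ],
        },
    }


if __name__ == "__main__":
    # "dataverse" is a hypothetical name for both the stream and the deployment.
    objects = [image_stream("dataverse", "ndslabs/dataverse"),
               deployment_config("dataverse", "dataverse")]
    print(json.dumps(objects, indent=2))
```

The printed objects would go into the "objects" array of a template. The point of the indirection is that the container never hard-codes the Docker Hub location: it names the image stream, and the stream decides where the image comes from.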
@danmcp thanks! I tried at pdurbin@e1e492f (pushed to my personal repo this time because of the error below) but I got a crazy error:
It obviously shouldn't give that error but it doesn't like your json. Try this one:
added a commit that referenced this issue on Sep 15, 2017
@danmcp thanks! Added in 4702e0a. Under "Applications" there are now entries under "Deployments" and "Pods" which seems like great progress, but I'm getting this in the log:
Here's a screenshot:
Scratch that. I tried again and now I'm getting this:
This error seems to be coming from https://github.com/nds-org/ndslabs-dataverse/blob/9ddc9efa54185ffd69e25487159a09c4bb2e56bf/dockerfiles/dataverse/entrypoint.sh#L69

https://github.com/nds-org/ndslabs-dataverse/blob/9ddc9efa54185ffd69e25487159a09c4bb2e56bf/dockerfiles/README.md#starting-dataverse-under-docker has some nice information about how you have to start PostgreSQL and Solr before starting Dataverse, which makes sense.
@pdurbin Similar to this example:

You're going to want to add postgres and solr to the same template, and have env vars generated to connect them all together.
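The example referenced above is missing from this copy of the thread. As a hedged illustration of the suggestion (one template holding Dataverse, postgres, and Solr, wired together through parameters and environment variables), here is a Python sketch that assembles such a template. The service names, the parameter defaults, the "solr" image placeholder, and the environment variable names read by the Dataverse container are all assumptions. The `POSTGRESQL_*` variables follow the convention of the OpenShift postgresql image linked earlier in the thread.

```python
import json

# Service names double as DNS hostnames inside the project. All three are hypothetical.
POSTGRES_SVC = "dataverse-postgresql"
SOLR_SVC = "dataverse-solr"
APP_SVC = "dataverse-glassfish"


def env(**pairs):
    """Turn keyword arguments into the Kubernetes env-var list format."""
    return [{"name": key, "value": value} for key, value in pairs.items()]


def service(name, port):
    """A Service that routes to pods labelled name=<name>."""
    return {
        "kind": "Service",
        "apiVersion": "v1",
        "metadata": {"name": name},
        "spec": {"selector": {"name": name},
                 "ports": [{"port": port, "targetPort": port}]},
    }


def deployment(name, image, port, env_vars):
    """A single-replica DeploymentConfig running one container."""
    container = {"name": name, "image": image, "env": env_vars,
                 "ports": [{"containerPort": port, "protocol": "TCP"}]}
    return {
        "kind": "DeploymentConfig",
        "apiVersion": "v1",
        "metadata": {"name": name},
        "spec": {
            "replicas": 1,
            "selector": {"name": name},
            "template": {"metadata": {"labels": {"name": name}},
                         "spec": {"containers": [container]}},
            "triggers": [{"type": "ConfigChange"}],
        },
    }


def build_template():
    """Assemble a Template whose parameters are shared by the database and the app."""
    db_env = env(POSTGRESQL_USER="${POSTGRESQL_USER}",
                 POSTGRESQL_PASSWORD="${POSTGRESQL_PASSWORD}",
                 POSTGRESQL_DATABASE="${POSTGRESQL_DATABASE}")
    # Assumption: the variable names the Dataverse container reads are illustrative.
    app_env = env(POSTGRES_SERVER=POSTGRES_SVC, POSTGRES_PORT="5432",
                  POSTGRES_USER="${POSTGRESQL_USER}",
                  POSTGRES_PASSWORD="${POSTGRESQL_PASSWORD}",
                  POSTGRES_DATABASE="${POSTGRESQL_DATABASE}",
                  SOLR_SERVER=SOLR_SVC, SOLR_PORT="8983")
    return {
        "kind": "Template",
        "apiVersion": "v1",
        "metadata": {"name": "dataverse"},
        "parameters": [
            {"name": "POSTGRESQL_USER", "value": "dataverse"},      # placeholder default
            {"name": "POSTGRESQL_DATABASE", "value": "dataverse"},  # placeholder default
            # Generated once when the template is processed, then reused everywhere.
            {"name": "POSTGRESQL_PASSWORD", "generate": "expression",
             "from": "[a-zA-Z0-9]{16}"},
        ],
        "objects": [
            service(POSTGRES_SVC, 5432),
            deployment(POSTGRES_SVC, "openshift/postgresql-92-centos7", 5432, db_env),
            service(SOLR_SVC, 8983),
            deployment(SOLR_SVC, "solr", 8983, []),  # "solr" is a placeholder image
            service(APP_SVC, 8080),
            deployment(APP_SVC, "ndslabs/dataverse", 8080, app_env),
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_template(), indent=2))
```

Saved to a file, the output could be loaded into an existing project with something like `oc process -f template.json | oc create -f -`. Because the password is a generated parameter, postgres and Dataverse receive the same value without it being written into the template.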
added a commit that referenced this issue on Sep 18, 2017
@danmcp thanks, I made some progress, I think, by adding the
pdurbin self-assigned this on Mar 8, 2018
added a commit that referenced this issue on Mar 9, 2018
added a commit that referenced this issue on Mar 9, 2018
added a commit that referenced this issue on Mar 12, 2018
added a commit that referenced this issue on Mar 12, 2018
I had a nice meeting on Friday with @danmcp @DirectXMan12 @patrickdillon @MichaelClifford, Ashwin, and Ryan (sorry, I don't know your GitHub usernames). I merged pull request #4501 from @danmcp to fix up our OpenShift config. Based on feedback in that meeting I made some improvements to the OpenShift and Docker sections of the dev guide in pull request #4500. Heads up that as part of #4419 I'm moving that content to a dedicated page (see 6bee8d1 for example).
pdurbin added the Status: Ready label and removed the Status: Development label on Mar 12, 2018
pdurbin removed their assignment on Mar 12, 2018
@patrickdillon discovered that on the "develop" branch (I just tested 5ed5edf), when you create a dataverse it is not indexed into Solr (thanks!). The UI doesn't show the facets (screenshot attached) and in server.log we see errors like this:
To be honest, I don't remember if indexing ever worked in the OpenShift environment. The main way I've been testing is by logging in. Here's a screenshot of how the dataverse I just created isn't indexed:

2018-03-21 Update: The screenshot above is actually a bad example because it's expected that a dataverse you just created doesn't have any children. However, I re-tested this yesterday and it really is broken. If you navigate to the root, nothing shows as being indexed. I'm hoping to fix this as part of the upgrade to Solr 7 in #4158.
This was referenced on Mar 18, 2018
added a commit that referenced this issue on Mar 22, 2018
Ok, I just tweaked our OpenShift config in 493badf and got Solr working in that branch/pull request, which hasn't been merged yet. The tag on Docker Hub is called "4158-update-solr" if anyone wants to try it out. This is the branch where we're upgrading from Solr 4 to Solr 7, so when it gets merged, we'll need to push new images to the "latest" tag on Docker Hub.

I should note that because I was struggling so mightily with getting Solr 7 working in OpenShift, I sort of gave up and started running it in /tmp over at 94786ff . At some point I'll work with @danmcp or @DirectXMan12 or some other OpenShift guru to either make this right or, preferably, move to a standard OpenShift-compatible image that we don't have to build ourselves, like we do with postgres (we use the postgres image from centos). This one from @dudash might be a candidate but I haven't tried it yet: https://github.com/dudash/openshift-docker-solr

Anyway, the other fix was to restart Glassfish to pick up the change to the
Pull request #4520 was merged yesterday and I just ran
This means that Solr has been upgraded to Solr 7 in that image and the Dataverse war file has been updated to a version (commit 037cb9c) that's compatible with it. Again, there's technical debt in that Solr image (it's running out of
patrickdillon referenced this issue on Apr 18, 2018 in "Add Postgresql Statefulsets with Replication to OpenShift/Kubernetes" #4598 (Closed)
MichaelClifford referenced this issue on Apr 25, 2018 in "Add Glassfish Statefulsets to OpenShift/Kubernetes" #4617 (Closed)
added a commit to EC528-Dataverse-Scaling/dataverse that referenced this issue on Apr 30, 2018
Just a quick note to say that I just pushed images to Docker Hub as of 639715d which includes the following changes:
I did a simple test of creating a dataverse and making sure that it's indexed. It seems fine.

Yesterday I went to the final demo of the stateful sets work (two of the pull requests above) that was contributed by the BU students @danmcp, @DirectXMan12, and I have been mentoring all semester. I highly recommend watching their final video at https://github.com/BU-NU-CLOUD-SP18/Dataverse-Scaling#our-project-video which explains what they were up to. We're still a long way from having a production-ready environment on OpenShift for running Dataverse, but these stateful sets will help us scale Glassfish and Postgres independently in the future. As a bonus, since the project included some load testing, check out the JMeter scripts that have been added to #4201.

A huge THANK YOU to these students for all of their hard work: Patrick Dillon, Michael Clifford, Ashwin Pillai, and Ryan Morano. See also the thread at https://groups.google.com/d/msg/dataverse-community/TSxf4MTYYjg/7VJB_-GJBAAJ

On a related note, another group of BU students in the same class worked on a project related to Dataverse and OpenShift. See the "Spark and Dataverse (Big Data Containers, computation)" thread for more: https://groups.google.com/d/msg/dataverse-community/P4llZSssZ2Q/zvhGltLpAQAJ . Thank you to them as well!
Just watched the final video and it was super cool to see this in action! You guys did a great job and it is awesome to see Dataverse being able to take the steps towards being fully scalable.
This is good news and the work is in the same direction as UiT's goals for Dataverse.
pdurbin commented Aug 3, 2017
Yesterday I met with @portante and @danmcp and talked a fair amount about the possibility of getting Dataverse running on OpenShift. There are multiple reasons why I'm interested in this:
Getting Dataverse running on OpenShift isn't on our roadmap, so I've created this issue so we can estimate it in sprint planning or backlog grooming. Anyone reading this is very welcome to leave comments or ask questions!