Get Dataverse running on OpenShift (Docker and Kubernetes) #4040

Closed
pdurbin opened this Issue Aug 3, 2017 · 86 comments

pdurbin commented Aug 3, 2017

Yesterday I met with @portante and @danmcp and talked a fair amount about the possibility of getting Dataverse running on OpenShift. There are multiple reasons why I'm interested in this:

  • Getting Dataverse running on OpenShift is a prerequisite for @portante at Red Hat running it. I'm excited about a potential first non-edu customer and was happy to meet Peter at the 2017 Dataverse Community Meeting!
  • Getting Dataverse running on OpenShift would greatly improve our "kick the tires on Dataverse" story, which involves running Dataverse on Vagrant on your laptop. You could walk your laptop over to your boss but it would be much nicer to send her a link to Dataverse running in the cloud (such as on OpenShift) so she can kick the tires herself.
  • Getting Dataverse running on OpenShift would help move #3938 forward because we would very likely build on the outstanding work by @craig-willis at @nds-org to Dockerize Dataverse to be part of NDS Labs Workbench. Related to this is that if we get Dataverse running on OpenShift, it should be fairly straightforward to get Dataverse running on other Kubernetes-based cloud offerings such as Google Container Engine (GKE).
  • Getting Dataverse running on OpenShift might help us improve the "make it possible to hack on Dataverse from a Windows computer" in #3927 because I believe @danmcp mentioned that OpenShift can be used in a development context. It reminds me a bit of how @ferrys says she doesn't install Glassfish, PostgreSQL, and Solr directly on her Mac but rather pushes code to an instance she has running on @CCI-MOC. Our current solution for Windows developers wanting to hack on Dataverse is to use Vagrant, which works but is slow and problematic.
  • Getting Dataverse running on OpenShift would open up the possibility of running https://dataverse.harvard.edu (or other installations of Dataverse) on AWS using something called OpenShift Dedicated, which supports both AWS and GKE (and Azure in the future). Related to this is that the "kick the tires" story could more easily transition into "go live" if OpenShift is used along the way.

Getting Dataverse running on OpenShift isn't on our roadmap so I've created this issue so we can estimate it in sprint planning or backlog grooming. Anyone reading this is very welcome to leave comments or ask questions!

pdurbin commented Sep 7, 2017

I mentioned this issue to @bjonnh yesterday in chat because I've tagged him as the primary contact for the dev effort by the community to work on Docker support in #3938. He said that free OpenShift accounts would be helpful for him, @donsizemore, and anyone else who wants to help with the effort to get Dataverse running on OpenShift.

danmcp commented Sep 7, 2017

You can get a free account with the starter tier:

https://www.openshift.com/pricing/index.html

That gets you 1GB of free memory. You can also use oc cluster up

https://github.com/openshift/origin/blob/master/docs/cluster_up_down.md

or minishift:

https://www.openshift.org/minishift/

to run OpenShift on your laptop.

danmcp commented Sep 7, 2017

Rules for writing good images:

https://docs.openshift.org/latest/creating_images/guidelines.html

How to set the memory based on the cgroup:

There is an example in here:

https://blog.openshift.com/managing-compute-resources-openshiftkubernetes/

Look under "Writing Applications". And here is an example from mysql:

https://github.com/sclorg/mysql-container/blob/master/5.5/root/usr/bin/cgroup-limits#L48

Our postgresql image:

https://hub.docker.com/r/openshift/postgresql-92-centos7/

Example templates:

https://github.com/openshift/origin/tree/master/examples

hello-openshift is a fine place to start and move on to sample-app
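
On the cgroup memory point above, here is a minimal Python sketch of the idea: read the container's memory limit from the cgroup filesystem and derive a JVM heap flag from it. The two file paths are the standard cgroup v1 and v2 locations. The 50% heap fraction, the "unlimited" threshold, and the function names are assumptions made for illustration; none of this is taken from the linked blog post or the mysql image.

```python
from pathlib import Path
from typing import Optional

# Standard cgroup v1 and v2 locations for the container memory limit.
CGROUP_LIMIT_FILES = (
    "/sys/fs/cgroup/memory/memory.limit_in_bytes",  # cgroup v1
    "/sys/fs/cgroup/memory.max",                    # cgroup v2
)

# Assumption: values at or above this mean "no limit" (v1 reports a huge number).
UNLIMITED_THRESHOLD = 1 << 60


def parse_limit(raw: str) -> Optional[int]:
    """Return the limit in bytes, or None when the cgroup is unlimited."""
    raw = raw.strip()
    if raw == "max":  # cgroup v2 spelling of "unlimited"
        return None
    value = int(raw)
    return None if value >= UNLIMITED_THRESHOLD else value


def read_cgroup_limit(paths=CGROUP_LIMIT_FILES) -> Optional[int]:
    """Parse the first readable limit file; None if none is readable or unlimited."""
    for path in paths:
        try:
            return parse_limit(Path(path).read_text())
        except (OSError, ValueError):
            continue
    return None


def jvm_heap_flag(limit_bytes: Optional[int], fraction: float = 0.5) -> str:
    """Build an -Xmx flag from the limit; the default fraction is an assumption."""
    if limit_bytes is None:
        return ""  # no limit known: let the JVM pick its own default
    heap_mb = max(int(limit_bytes * fraction) // (1024 * 1024), 1)
    return f"-Xmx{heap_mb}m"


if __name__ == "__main__":
    print(jvm_heap_flag(read_cgroup_limit()))
```

An entrypoint script could pass the resulting flag to Glassfish or Solr at startup. The right fraction depends on how much non-heap memory each service needs.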

pdurbin commented Sep 7, 2017

@danmcp thanks!

@portante @danmcp @landreev @scolapasta and I had a great meeting today. Here's a picture of the whiteboard:

[Image: img_20170907_130127, photo of the whiteboard]

@danmcp already did his to do list items and my todo list item is to take the latest images from https://hub.docker.com/r/ndslabs/dataverse/ and reference them in a new file at conf/openshift/openshift.yaml ( @danmcp I'm seeing a YAML example at https://docs.openshift.org/latest/dev_guide/templates.html#writing-templates but not in https://github.com/openshift/origin/tree/master/examples/hello-openshift )

Basically, we'll be trying to see what breaks when we try to deploy the NDS Labs images "as is" to OpenShift. From the whiteboard, we'll need to dig into these questions about the NDS Labs images in order to make sure they run on OpenShift:

  • Do the images run as root?
  • Is the storage abstracted?
  • Can we use the CentOS PostgreSQL image where we already know that the image doesn't run as root and the storage is already abstracted?
  • Have you tuned the memory on Java apps (Solr and Glassfish) based on running inside a container with cgroups?

We are deferring the following concerns until the future:

  • clustering PostgreSQL
  • clustering Solr
  • running multiple Glassfish servers
  • a pipeline for updating Dataverse within OpenShift (new Dataverse release, updated config for Dataverse, etc.)

Basically, the definition of done for this issue is that someone interested in kicking the tires on Dataverse for non-production use will be able to spin it up on the free 1 GB OpenShift "starter" plan. The whiteboard drawing offers some clues on what the pull request might look like. In our conf directory, we'll have a Dockerfile each for Solr, PostgreSQL, and Dataverse+Glassfish. We'll have a build script to create the images and push them to DockerHub (I'll create an account for IQSS). We'll have the OpenShift YAML file I mentioned above. We'll have some docs for people who want to kick the tires on Dataverse.

@danmcp I swung by @djbrooke 's office and we'd like to figure out when a good time to put this into a sprint would be. The first available would start next Wednesday, Sep 13 and go for two weeks. Let's not pick a time when you're on vacation! 😄

danmcp commented Sep 7, 2017

@pdurbin Many of the examples are in json. The templates can be either.

I should be around most of the time over the next few weeks.

pdurbin commented Sep 14, 2017

@danmcp awesome. Today I signed up for an OpenShift account and went through https://docs.openshift.com/online/getting_started/index.html . That doc is slightly out of date and I got some weird errors along the way (I grabbed a screenshot if you want it) but eventually they resolved themselves and I could see at http://nodejs-mongo-persistent-pdurbin-example.1d35.starter-us-east-1.openshiftapps.com the simple change I made at pdurbin/nodejs-ex@c6efab9 . Great.

I looked at https://github.com/openshift/origin/blob/v3.7.0-alpha.1/examples/hello-openshift/hello-project.json and noticed that there were no containers in there so I added a containers array under "spec" and included some images from NDS Labs.

I'm currently blocked on the error "cannot create projects at the cluster scope" and left a note about this in d287772, which is the first commit of a new 4040-docker-openshift branch I pushed to this repo. Can you please take a look at that commit and let me know what I'm doing wrong? Thanks!

danmcp commented Sep 14, 2017

@pdurbin In your case, you're not going to want to create a project but rather import into an existing project. You should have a kind of template like this one:

https://github.com/openshift/origin/blob/master/examples/sample-app/application-template-stibuild.json

pdurbin added a commit that referenced this issue Sep 15, 2017

pdurbin commented Sep 15, 2017

@danmcp thanks, in 77b3f67 I switched from "Project" to "Template" and stubbed out in the Dataverse dev guide how to use Minishift, which I just installed and have been playing with (with some guidance from @pameyer ). I was able to expose a route but I'm not sure how to expose the Docker image at https://hub.docker.com/r/ndslabs/dataverse/ within my installation of Minishift. Any advice?

danmcp commented Sep 15, 2017

@pdurbin You will just reference ndslabs/dataverse from an imagestream like this:

https://github.com/openshift/origin/blob/master/examples/sample-app/application-template-stibuild.json#L83

Then from your container you would reference the imagestream like this:

https://github.com/openshift/origin/blob/master/examples/sample-app/application-template-stibuild.json#L247

with the name of the image stream you picked.

pdurbin commented Sep 15, 2017

@danmcp thanks! I tried at pdurbin@e1e492f (pushed to my personal repo this time because of the error below) but I got a crazy error:

murphy:dataverse pdurbin$ oc new-app conf/openshift/openshift.json 
--> Deploying template "project1/dataverse" for "conf/openshift/openshift.json" to project project1

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x90 pc=0xe4846d]

goroutine 1 [running]:
panic(0x33760e0, 0xc420010080)
	/usr/local/go/src/runtime/panic.go:500 +0x1a1
github.com/openshift/origin/pkg/util.addDeploymentConfigNestedLabels(0xc420640d20, 0xc42115eab0, 0x2, 0x0, 0xc42115ebd0)
	/go/src/github.com/openshift/origin/pkg/util/labels.go:181 +0x2d
github.com/openshift/origin/pkg/util.AddObjectLabelsWithFlags(0x5ba8fe0, 0xc420640d20, 0xc42115eab0, 0x2, 0xc420640d20, 0x0)
	/go/src/github.com/openshift/origin/pkg/util/labels.go:44 +0x846
github.com/openshift/origin/pkg/cmd/cli/cmd.hasLabel(0xc42115eab0, 0xc420ea12c0, 0xc421155a28, 0xc421155a18, 0xc42115eab0)
	/go/src/github.com/openshift/origin/pkg/cmd/cli/cmd/newapp.go:580 +0x102
github.com/openshift/origin/pkg/cmd/cli/cmd.(*NewAppOptions).RunNewApp(0xc420c3a538, 0x0, 0x2)
	/go/src/github.com/openshift/origin/pkg/cmd/cli/cmd/newapp.go:300 +0x1433
github.com/openshift/origin/pkg/cmd/cli/cmd.NewCmdNewApplication.func1(0xc4202a6900, 0xc420f94da0, 0x1, 0x1)
	/go/src/github.com/openshift/origin/pkg/cmd/cli/cmd/newapp.go:209 +0x10a
github.com/openshift/origin/vendor/github.com/spf13/cobra.(*Command).execute(0xc4202a6900, 0xc420f94d30, 0x1, 0x1, 0xc4202a6900, 0xc420f94d30)
	/go/src/github.com/openshift/origin/vendor/github.com/spf13/cobra/command.go:603 +0x439
github.com/openshift/origin/vendor/github.com/spf13/cobra.(*Command).ExecuteC(0xc4202b0240, 0xc42002a008, 0xc42002a018, 0xc4202b0240)
	/go/src/github.com/openshift/origin/vendor/github.com/spf13/cobra/command.go:689 +0x367
github.com/openshift/origin/vendor/github.com/spf13/cobra.(*Command).Execute(0xc4202b0240, 0x2, 0xc4202b0240)
	/go/src/github.com/openshift/origin/vendor/github.com/spf13/cobra/command.go:648 +0x2b
main.main()
	/go/src/github.com/openshift/origin/cmd/oc/oc.go:36 +0x196
murphy:dataverse pdurbin$ 

danmcp commented Sep 15, 2017

It obviously shouldn't give that error but it doesn't like your json. Try this one:

{
   "kind":"Template",
   "apiVersion":"v1",
   "metadata":{
      "name":"dataverse",
      "labels":{
         "name":"dataverse"
      },
      "annotations":{
         "openshift.io/description":"Dataverse is open source research data repository software: https://dataverse.org",
         "openshift.io/display-name":"Dataverse"
      }
   },
   "objects":[
      {
         "kind":"Service",
         "apiVersion":"v1",
         "metadata":{
            "name":"dataverse-glassfish-service"
         },
         "spec":{
            "ports":[
               {
                  "name":"web",
                  "protocol":"TCP",
                  "port":8080,
                  "targetPort":8080
               }
            ]
         }
      },
      {
         "kind":"ImageStream",
         "apiVersion":"v1",
         "metadata":{
            "name":"ndslabs-dataverse"
         },
         "spec":{
            "dockerImageRepository":"ndslabs/dataverse"
         }
      },
      {
         "kind":"DeploymentConfig",
         "apiVersion":"v1",
         "metadata":{
            "name":"dataverse-glassfish",
            "annotations":{
               "template.alpha.openshift.io/wait-for-ready":"true"
            }
         },
         "spec":{
            "template":{
               "metadata":{
                  "labels":{
                     "name":"ndslabs-dataverse"
                  }
               },
               "spec":{
                  "containers":[
                     {
                        "name":"ndslabs-dataverse",
                        "image":"ndslabs-dataverse",
                        "ports":[
                           {
                              "containerPort":8080,
                              "protocol":"TCP"
                           }
                        ],
                        "imagePullPolicy":"IfNotPresent",
                        "securityContext":{
                           "capabilities":{

                           },
                           "privileged":false
                        }
                     }
                  ]
               }
            },
            "strategy":{
               "type":"Rolling",
               "rollingParams":{
                  "updatePeriodSeconds":1,
                  "intervalSeconds":1,
                  "timeoutSeconds":120
               },
               "resources":{

               }
            },
            "triggers":[
               {
                  "type":"ImageChange",
                  "imageChangeParams":{
                     "automatic":true,
                     "containerNames":[
                        "ndslabs-dataverse"
                     ],
                     "from":{
                        "kind":"ImageStreamTag",
                        "name":"ndslabs-dataverse:latest"
                     }
                  }
               },
               {
                  "type":"ConfigChange"
               }
            ],
            "replicas":1,
            "selector":{
               "name":"ndslabs-dataverse"
            }
         }
      }
   ]
}
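
Because `oc new-app` panicked rather than reporting what was wrong with the earlier JSON, a quick local consistency check before deploying may help. The Python sketch below checks that every ImageChange trigger points at an ImageStream defined in the same template and at a container that exists. It is a hypothetical helper, not part of `oc`, and it assumes image streams live in the same template (a trigger can legitimately reference a stream elsewhere). `SAMPLE` is a trimmed copy of the template above.

```python
import json


def check_image_triggers(template: dict) -> list:
    """Return a list of problems found in ImageChange triggers (empty if none)."""
    objects = template.get("objects", [])
    streams = {
        obj["metadata"]["name"]
        for obj in objects
        if obj.get("kind") == "ImageStream"
    }
    problems = []
    for obj in objects:
        if obj.get("kind") != "DeploymentConfig":
            continue
        dc_name = obj["metadata"]["name"]
        spec = obj.get("spec", {})
        pod_spec = spec.get("template", {}).get("spec", {})
        containers = {c["name"] for c in pod_spec.get("containers", [])}
        for trigger in spec.get("triggers", []):
            if trigger.get("type") != "ImageChange":
                continue
            params = trigger.get("imageChangeParams", {})
            # "name" has the form "<imagestream>:<tag>"; keep the stream part.
            stream = params.get("from", {}).get("name", "").split(":")[0]
            if stream not in streams:
                problems.append(f"{dc_name}: unknown image stream {stream!r}")
            for name in params.get("containerNames", []):
                if name not in containers:
                    problems.append(f"{dc_name}: unknown container {name!r}")
    return problems


def load_template(path: str) -> dict:
    """Parse a JSON template file such as conf/openshift/openshift.json."""
    with open(path) as handle:
        return json.load(handle)


# Trimmed version of the template above, keeping only the fields being checked.
SAMPLE = {
    "kind": "Template",
    "objects": [
        {"kind": "ImageStream", "metadata": {"name": "ndslabs-dataverse"}},
        {
            "kind": "DeploymentConfig",
            "metadata": {"name": "dataverse-glassfish"},
            "spec": {
                "template": {"spec": {"containers": [{"name": "ndslabs-dataverse"}]}},
                "triggers": [
                    {
                        "type": "ImageChange",
                        "imageChangeParams": {
                            "containerNames": ["ndslabs-dataverse"],
                            "from": {
                                "kind": "ImageStreamTag",
                                "name": "ndslabs-dataverse:latest",
                            },
                        },
                    },
                    {"type": "ConfigChange"},
                ],
            },
        },
    ],
}

print(check_image_triggers(SAMPLE))
```

Calling `check_image_triggers(load_template("conf/openshift/openshift.json"))` would run the same check against the file in the repo.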

pdurbin added a commit that referenced this issue Sep 15, 2017

pdurbin commented Sep 15, 2017

@danmcp thanks! Added in 4702e0a. Under "Applications" there are now entries under "Deployments" and "Pods" which seems like great progress, but I'm getting this in the log:

--> Scaling dataverse-glassfish-1 to 1
--> Waiting up to 2m0s for pods in rc dataverse-glassfish-1 to become ready
error: update acceptor rejected dataverse-glassfish-1: pods for rc "dataverse-glassfish-1" took longer than 120 seconds to become ready

Here's a screenshot:

[Image: screen shot 2017-09-15 at 7 38 34 pm]

pdurbin commented Sep 15, 2017

Scratch that. I tried again and now I'm getting this:

Using Rserve at localhost:6311
Optional service Rserve not running.
Using Postgres at localhost:5432
Required service Postgres not running. Have you started the required services?

This error seems to be coming from https://github.com/nds-org/ndslabs-dataverse/blob/9ddc9efa54185ffd69e25487159a09c4bb2e56bf/dockerfiles/dataverse/entrypoint.sh#L69

https://github.com/nds-org/ndslabs-dataverse/blob/9ddc9efa54185ffd69e25487159a09c4bb2e56bf/dockerfiles/README.md#starting-dataverse-under-docker has some nice information about how you have to start PostgreSQL and Solr before starting Dataverse, which makes sense.
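
The startup ordering described above (PostgreSQL and Solr must be reachable before Dataverse starts) amounts to a TCP readiness check. The following Python sketch shows one way to poll for it. It is an illustration only, not a port of the NDS Labs entrypoint.sh. The function names, timeouts, and the `REQUIRED` mapping are assumptions, with the ports taken from the log output.

```python
import socket
import time


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_for_services(services: dict, deadline_s: float = 60.0,
                      poll_s: float = 1.0) -> list:
    """Poll until every (host, port) accepts connections.

    Returns the sorted names of services that are still unreachable when the
    deadline passes (an empty list means everything is up).
    """
    pending = dict(services)
    end = time.monotonic() + deadline_s
    while pending:
        # Keep only the services that still refuse connections.
        pending = {name: hp for name, hp in pending.items()
                   if not port_is_open(*hp)}
        if not pending or time.monotonic() >= end:
            break
        time.sleep(poll_s)
    return sorted(pending)


# Required services and ports as they appear in the log above.
REQUIRED = {
    "postgres": ("localhost", 5432),
    "solr": ("localhost", 8983),
}
```

Calling `wait_for_services(REQUIRED)` before starting Glassfish would return the names of any required services that never came up. In a multi-container setup, `localhost` would presumably be replaced by the service hostnames.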

danmcp commented Sep 16, 2017

@pdurbin Similar to this example:

https://github.com/openshift/origin/blob/master/examples/sample-app/application-template-stibuild.json

You're going to want to add postgres and solr to the same template. And have env vars generated to connect them all together.
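
One related mechanism for wiring the containers together is the pair of environment variables Kubernetes injects for each Service: `<NAME>_SERVICE_HOST` and `<NAME>_SERVICE_PORT`, with the service name upper-cased and dashes turned into underscores. These variables are only visible to pods started after the Service exists. This is not necessarily what the template-generated env vars mentioned above would look like. The Python sketch below shows how a startup script might resolve an endpoint from them. The service names in the usage note are hypothetical.

```python
import os


def service_endpoint(service_name, env=None, default_host="localhost",
                     default_port=0):
    """Resolve (host, port) for a Service from Kubernetes-style env vars.

    Falls back to the defaults when the variables are absent, for example
    when running outside a cluster.
    """
    if env is None:
        env = os.environ
    # Kubernetes upper-cases the service name and maps "-" to "_".
    prefix = service_name.upper().replace("-", "_")
    host = env.get(f"{prefix}_SERVICE_HOST", default_host)
    port = int(env.get(f"{prefix}_SERVICE_PORT", default_port))
    return host, port
```

For example, `service_endpoint("dataverse-solr-service", default_port=8983)` would look for `DATAVERSE_SOLR_SERVICE_SERVICE_HOST` and fall back to `localhost:8983` if it is not set.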

pdurbin added a commit that referenced this issue Sep 18, 2017

pdurbin commented Sep 18, 2017

@danmcp thanks, I made some progress, I think, by adding the centos/postgresql-94-centos7 image in f41d753 but you're welcome to let me know if I'm doing something wrong. I assume I'll still need to mess with the postgres user, password and database values, but the port must be open now because in console it got past the postgres check and is now failing on Solr, which I guess I'll work on next:

Using Rserve at localhost:6311
Optional service Rserve not running.
Using Postgres at localhost:5432
Postgres running
Using Solr at localhost:8983
Required service Solr not running. Have you started the required services?

@pdurbin pdurbin self-assigned this Mar 8, 2018

pdurbin added a commit that referenced this issue Mar 9, 2018

pdurbin added a commit that referenced this issue Mar 9, 2018

pdurbin added a commit that referenced this issue Mar 12, 2018

pdurbin added a commit that referenced this issue Mar 12, 2018

Merge pull request #4500 from IQSS/4040-dev-guide
doc improvements for using OpenShift and Minishift #4040

pdurbin commented Mar 12, 2018

I had a nice meeting on Friday with @danmcp @DirectXMan12 @patrickdillon @MichaelClifford Ashwin and Ryan (sorry, I don't know your GitHub username). I merged pull request #4501 from @danmcp to fix up our OpenShift config.

Based on feedback in that meeting I made some improvements to the OpenShift and Docker sections of the dev guide in pull request #4500. Heads up that as part of #4419 I'm moving that content to a dedicated page (see 6bee8d1 for example).

@pdurbin pdurbin removed their assignment Mar 12, 2018

pdurbin commented Mar 14, 2018

@patrickdillon discovered that on the "develop" branch (I just tested 5ed5edf), when you create a dataverse it is not indexed into Solr (thanks!). The UI doesn't show the facets (screenshot attached) and in server.log we see errors like this:

<h2>HTTP ERROR: 404</h2>
<p>Problem accessing /solr/update. Reason:
<pre>    Not Found</pre></p>

<h2>HTTP ERROR: 404</h2>
<p>Problem accessing /solr/spell. Reason:
<pre>    Not Found</pre></p>

To be honest, I don't remember if indexing ever worked in the OpenShift environment. The main way I've been testing is by logging in. Here's a screenshot of how the dataverse I just created isn't indexed:

[Image: screen shot 2018-03-14 at 5 49 38 pm]

2018-03-21 Update: The screenshot above is actually a bad example because it's expected that a dataverse you just created doesn't have any children. However, I re-tested this yesterday and it really is broken. If you navigate to the root, nothing shows as being indexed. I'm hoping to fix this as part of the upgrade to Solr 7 in #4158.

pdurbin added a commit that referenced this issue Mar 22, 2018

restart glassfish to pick up solr server change #4158 #4040
Also add more error checking to build.sh

Also track default.config used in minishift/openshift.

pdurbin commented Mar 22, 2018

Ok, I just tweaked our OpenShift config in 493badf and got Solr working in that branch/pull request, which hasn't been merged yet. The tag on DockerHub is called "4158-update-solr" if anyone wants to try it out. This is the branch where we're upgrading from Solr 4 to Solr 7 so when it gets merged, we'll need to push new images to the "latest" tag on Docker Hub.

I should note that because I was struggling so mightily with getting Solr 7 working in OpenShift, I sort of gave up and started running it in /tmp over at 94786ff . At some point I'll work with @danmcp or @DirectXMan12 or some other OpenShift guru to either make this right or, preferably, move to a standard OpenShift-compatible image that we don't have to build ourselves, like we do with postgres (we use the postgres image from CentOS). This one from @dudash might be a candidate but I haven't tried it yet: https://github.com/dudash/openshift-docker-solr

Anyway, the other fix was to restart Glassfish to pick up the change to the :SolrHostColonPort setting. I'd consider this a bug in the installer that we should fix. Most people who install Dataverse don't run into this because they install everything on a single server.
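
For context, `:SolrHostColonPort` is a database setting that is normally changed through the Dataverse admin settings API with an HTTP PUT. The Python sketch below only builds that request so the shape is visible. The base URL and the Solr hostname in the usage note are assumptions, and the sketch does not cover the Glassfish restart described above.

```python
import urllib.request


def solr_setting_request(solr_host_port: str,
                         dataverse_url: str = "http://localhost:8080"):
    """Build (but do not send) a PUT that sets :SolrHostColonPort.

    dataverse_url is an assumed base URL for a local Glassfish instance.
    """
    url = f"{dataverse_url}/api/admin/settings/:SolrHostColonPort"
    return urllib.request.Request(
        url,
        data=solr_host_port.encode("utf-8"),  # request body is "host:port"
        method="PUT",
    )
```

Something like `urllib.request.urlopen(solr_setting_request("dataverse-solr-service:8983"))` would send it, where the hostname is hypothetical. Per the comment above, Glassfish then still needed a restart in this environment to pick up the change.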

pdurbin commented Apr 3, 2018

Pull request #4520 was merged yesterday and I just ran build.sh in conf/docker to push new images to the "latest" tag:

This means that Solr has been upgraded to Solr 7 in that image and the Dataverse war file has been updated to a version (commit 037cb9c) that's compatible with it. Again, there's technical debt in that Solr image (it's running out of /tmp), but these images are not intended for production use at this time. As I mentioned before, we should investigate switching to https://github.com/dudash/openshift-docker-solr or some other Solr image that already runs well on OpenShift.

pdurbin commented May 2, 2018

Just a quick note to say that I just pushed images to Docker Hub as of 639715d which includes the following changes:

  • pull request #4598 Add Postgresql Statefulsets with Replication to OpenShift/Kubernetes
  • pull request #4617 Add Glassfish Statefulsets to OpenShift/Kubernetes
  • pull request #4621 Solr 7.3.0 upgrade

I did a simple test of creating a dataverse and making sure that it's indexed. It seems fine.

Yesterday I went to the final demo of the stateful sets work (two of the pull requests above) that was contributed by the BU students @danmcp @DirectXMan12 and I have been mentoring all semester. I highly recommend watching their final video at https://github.com/BU-NU-CLOUD-SP18/Dataverse-Scaling#our-project-video which explains what they were up to. We're still a long way from having a production-ready environment on OpenShift for running Dataverse, but these stateful sets will help us scale Glassfish and Postgres independently in the future. As a bonus, since the project included some load testing, check out the JMeter scripts that have been added to #4201. A huge THANK YOU to these students for all of their hard work: Patrick Dillon, Michael Clifford, Ashwin Pillai, and Ryan Morano. See also the thread at https://groups.google.com/d/msg/dataverse-community/TSxf4MTYYjg/7VJB_-GJBAAJ

On a related note, another group of BU students in the same class worked on a project related to Dataverse and OpenShift. See the "Spark and Dataverse (Big Data Containers, computation)" thread for more: https://groups.google.com/d/msg/dataverse-community/P4llZSssZ2Q/zvhGltLpAQAJ . Thank you to them as well!

ferrys commented May 3, 2018

Just watched the final video and it was super cool to see this in action! You guys did a great job and it is awesome to see Dataverse being able to take the steps towards being fully scalable.

xibriz commented May 8, 2018

This is good news and the work is in the same direction as UiT's goals for Dataverse 👍

pdurbin commented Nov 13, 2018

Given that there are 96 hidden comments on this issue...

[Image: screen shot 2018-11-13 at 9 44 07 am]

... it's probably time for a fresh one. When we pick up this work again let's open a new issue and link to it from here. Closing.
