Create graphs of dashboard data #290

Closed
ruebot opened this issue Apr 25, 2019 · 18 comments

4 participants
@ruebot
Member

commented Apr 25, 2019

We have some info that'd be nice to graph out on a dashboard page (or I've just been looking at the AWS Lambda and CloudWatch consoles too much lately).

Maybe:

  • Download speeds over time (We have start time, end time of a download job along with collection size)
  • Spark throughput over time
  • Graphpass throughput over time
  • Text filter throughput over time
  • Seed jobs throughput over time
  • Cleanup throughput over time
  • New accounts over time (I think we have data for that)
@ruebot

Member Author

commented Apr 30, 2019

@SamFritz @ianmilligan1 let me know what you think of this:
Screenshot_2019-04-29 Graphs AUK Dashboard Archives Unleashed(1)

That should be all the bullet points above, plus jobs run over time.

I'm using Chartkick.

  • Users is an area chart (by day - I'd probably change this to month)
  • Jobs is a multiple-series chart (by month)
  • The rest are scatter plots
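
For reference, Chartkick's multiple-series charts take an array of `{ name:, data: }` hashes, one per series. A minimal sketch of shaping jobs into that form by month, in plain Ruby. The `Job` struct, its fields, and the sample rows are assumptions for illustration, not the actual AUK schema:

```ruby
# Hypothetical stand-in for a dashboard row; the real table may differ.
Job = Struct.new(:queue, :start_time)

# Build Chartkick-style multiple-series data: one series per queue,
# each mapping a "YYYY-MM" month bucket to the number of jobs started.
def jobs_by_month(jobs)
  jobs.group_by(&:queue).map do |queue, queue_jobs|
    counts = queue_jobs
             .group_by { |job| job.start_time.strftime('%Y-%m') }
             .map { |month, in_month| [month, in_month.size] }
             .sort
             .to_h
    { name: queue, data: counts }
  end
end

# Made-up sample data.
SAMPLE_JOBS = [
  Job.new('spark', Time.utc(2019, 3, 4)),
  Job.new('spark', Time.utc(2019, 3, 20)),
  Job.new('spark', Time.utc(2019, 4, 2)),
  Job.new('download', Time.utc(2019, 4, 9))
].freeze

SERIES = jobs_by_month(SAMPLE_JOBS)
```

In a view, a result like `SERIES` could be handed to a Chartkick helper such as `line_chart` or `column_chart`.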

I'm at York tomorrow, so no rush.

ruebot self-assigned this Apr 30, 2019

@ruebot

Member Author

commented Apr 30, 2019

(Oh, if you check out that Chartkick page, you'll see the option to download the charts as images. I set that up for each of them, so y'all can use any of those in presentations in the future 😉 )

@SamFritz

Member

commented Apr 30, 2019

This is looking great @ruebot! Really like how the images are easily downloadable for additional use :)

Just wondering, for graphs that have an axis labeled 2018.0, 2018.1, 2018.2, etc., does the decimal point indicate the day of the year? E.g., 2018.3 = January 3, 2018?

@ruebot

Member Author

commented Apr 30, 2019

Screenshot_2019-04-30 Graphs AUK Dashboard Archives Unleashed

Throughput graphs updated:

  • x-axis: id (just an auto-incrementing number in the job dashboard table)
  • y-axis: job time in minutes
  • First 4 throughput graphs are: scatter, area, line, column

Let me know which of those 4 you like the best, then we can work on axis labels.

@lintool

Member

commented Apr 30, 2019

"jobs" should have smoothing; in fact, none of the graphs should.

@ruebot

Member Author

commented Apr 30, 2019

@lintool should or shouldn't?

@lintool

Member

commented Apr 30, 2019

Sorry - shouldn't

@ruebot

Member Author

commented Apr 30, 2019

Screenshot_2019-04-30 Graphs AUK Dashboard Archives Unleashed(1)

Curves/smoothing removed.

@ruebot

Member Author

commented Apr 30, 2019

Also, "throughput" probably isn't the best word to use here for all the "throughput" charts. Those are all just showing how long each job was.

This is the method that basically powers all of them:

  # Returns { job id => job length in minutes } for finished jobs in a queue.
  def job_times(queue_name)
    finished = Dashboard.where(queue: queue_name)
                        .where.not(end_time: nil)
                        .pluck(:id, :start_time, :end_time)

    finished.map { |id, start_time, end_time|
      [id, TimeDifference.between(start_time, end_time).in_minutes]
    }.to_h
  end
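
To see the shape of the data a method like the one above produces, here is a self-contained sketch of the same idea in plain Ruby. It stands in for the ActiveRecord query and the `TimeDifference` gem with a `Struct` and simple `Time` arithmetic; `JobRow`, `job_lengths`, and the sample rows are made up for illustration and are not part of the AUK code:

```ruby
# Hypothetical stand-in for a row of the dashboard table.
JobRow = Struct.new(:id, :queue, :start_time, :end_time)

# Map each finished job's id to its length in minutes for one queue.
# Jobs without an end_time (still running) are skipped.
def job_lengths(rows, queue_name)
  rows.select { |row| row.queue == queue_name && row.end_time }
      .map { |row| [row.id, ((row.end_time - row.start_time) / 60.0).round(2)] }
      .to_h
end

# Made-up sample data: job 2 is unfinished, job 3 belongs to another queue.
ROWS = [
  JobRow.new(1, 'spark', Time.utc(2019, 4, 30, 9, 0), Time.utc(2019, 4, 30, 9, 45)),
  JobRow.new(2, 'spark', Time.utc(2019, 4, 30, 10, 0), nil),
  JobRow.new(3, 'download', Time.utc(2019, 4, 30, 11, 0), Time.utc(2019, 4, 30, 11, 30)),
  JobRow.new(4, 'spark', Time.utc(2019, 4, 30, 12, 0), Time.utc(2019, 4, 30, 13, 30))
].freeze

LENGTHS = job_lengths(ROWS, 'spark')
```

The resulting hash (job id on the x-axis, minutes on the y-axis) is the kind of flat key/value data a single-series chart can plot directly.
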
@ianmilligan1

Member

commented Apr 30, 2019

These look great, @ruebot – esp. without curves/smoothing. At a glance it gives a fantastic overview of where we're at with our different processes and elements, which is what we like dashboards to do.

As for throughput graph style, I would have a preference for area or line. Scatter requires more effort to read and column is a bit boxy.

@ruebot

Member Author

commented May 1, 2019

Ok, I'm going to commit and push this up to production here in a moment. Then we can do further feedback from there, as y'all will be able to interact with it with production data.

ruebot added a commit that referenced this issue May 1, 2019

Add graphs and re-org dashboards; address #290.
- Added chartkick gem
- Updated routes for dashboard re-org
- Added stats page (buttons on old dashboard)
- Added jobs page (table on old dashboard)
- Added graphs, and helper method
- Update rubocop config
- Removed pagination from dashboards model
@ruebot

Member Author

commented May 1, 2019

Updated! Check it out, and drop feedback in here.

@ruebot

Member Author

commented May 1, 2019

Caught a couple things that might be helpful:

  • Title of stats page isn't consistent (updated locally)
  • Added id column to jobs table, so you can see which job it is from the graph.
@ruebot

Member Author

commented May 1, 2019

Maybe "User Registrations" graph should go by month, not day.

@ruebot

Member Author

commented May 1, 2019

  • Rename all "throughput" graphs to "<queue_name> Job Length"
  • Add actual throughput graphs for each queue; job time over collection size (GB)
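
One way to read "job time over collection size" is minutes of job time per gigabyte of collection. A minimal sketch under that assumption; the function name, the byte-based size input, and the 1 GB = 10^9 bytes convention are illustrative choices, not taken from the AUK code:

```ruby
# Assumed convention: decimal gigabytes.
BYTES_PER_GB = 1_000_000_000.0

# Minutes of job time per gigabyte of collection data.
# Returns nil when the job is unfinished or the collection is empty,
# so the point can be left off the chart instead of dividing by zero.
def minutes_per_gb(start_time, end_time, size_bytes)
  return nil if end_time.nil? || size_bytes.to_i <= 0

  minutes = (end_time - start_time) / 60.0
  (minutes / (size_bytes / BYTES_PER_GB)).round(2)
end

# A one-hour job over a 4 GB collection.
RATE = minutes_per_gb(Time.utc(2019, 5, 1, 8, 0), Time.utc(2019, 5, 1, 9, 0), 4_000_000_000)
```

Plotting this value per job id would sit alongside the job length charts while normalizing for collection size.
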
@ianmilligan1

Member

commented May 1, 2019

These look fantastic, @ruebot - as you'll see from the Slack, I was nerding out and excitedly screenshotting.

A few thoughts:

Maybe "User Registrations" graph should go by month, not day.

That's probably most useful. When we have clusters of sign ups, we can notice it on the jobs dashboard, whereas the "user registrations" I think could give us good long-term data?

Rename all "throughput" graphs to "<queue_name> Job Length"

Good call – that's probably more understandable at a glance.

Add actual throughput graphs for each queue; job time over collection size (GB)

👍 that'd be really useful!

ruebot added a commit that referenced this issue May 1, 2019

Implemented throughput charts (not job length); addresses #290.
- Renamed all "throughput" charts to "job length"
- Implemented throughput charts
- Updated user registrations chart to group by month instead of day
- Added "id" column to jobs table
- Updated titles of stats page
- Added routes for all new data feeds for charts

ruebot added a commit that referenced this issue May 1, 2019

Removed seed throughput charts; addresses #290.
- Seed throughput doesn't go by collection.
- Removed chart, controller method, and route
@ianmilligan1

Member

commented May 1, 2019

New charts look great! Tooltips work really well, the user registrations per month are very handy and good to see our long-term growth, and the throughput and job lengths are legible and really useful to me.

ruebot added a commit that referenced this issue May 2, 2019

Added user ratio pie chart; address #290.
- Added user ratio chart controller methods
- Added route for user ratio pie chart data
- Updated rubocop config
@ruebot

Member Author

commented May 2, 2019

Resolved with:

ruebot closed this May 2, 2019

ruebot added a commit that referenced this issue May 2, 2019
