Thursday, May 12, 2016

Internship post #2

Week 19 notes

0.1 Week 19, 2016

Happily it’s not week 19 of the internship, just 2016. However, it’s already week 10 of 34!

thisweek <- 10
# fraction of the 34-week internship done vs. still to go
yikes <- c(thisweek, 34 - thisweek) / 34
# one stacked horizontal bar
barplot(as.matrix(yikes), horiz = TRUE, beside = FALSE)

Last week I got all our current scores loaded into a local Postgres database, poked through a Shiny tutorial, switched from the RPostgres driver to dplyr (for better or worse), and corrected a first CRPS plot so that it makes a little sense.

It’s true, sometimes this geologist struggles to grok the model scores in forecasting!
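For reference, a minimal sketch of what CRPS measures for a single ensemble forecast, using the standard empirical estimator (mean absolute error of the members minus half their mean pairwise spread). The function name `crps_ensemble` is made up for illustration, and this is not necessarily how the scores in our database were computed.

```r
# Empirical CRPS for one observation and an m-member ensemble:
#   CRPS = mean|x_i - y| - (1/2) * mean|x_i - x_j|
# Lower is better; the units are those of the forecast variable.
crps_ensemble <- function(ens, obs) {
  stopifnot(is.numeric(ens), length(ens) >= 1, length(obs) == 1)
  accuracy <- mean(abs(ens - obs))               # distance of members from the observation
  spread   <- mean(abs(outer(ens, ens, "-")))    # mean distance between all member pairs
  accuracy - 0.5 * spread
}

crps_ensemble(c(1, 2, 3), obs = 2)  # 2/3 - 4/9 = 2/9
crps_ensemble(5, obs = 3)           # single member: plain absolute error, 2
```

With a single member the spread term vanishes, so CRPS reduces to the absolute error, which makes it easier to read the score next to deterministic forecasts.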

This week I need to:

  • decide if the SOS database format is the way to go forward,
    • which makes maintenance easier
    • but dplyr less useful
    • and I’d learn curl
  • load data to AWS instance
    • use .Renviron to point at dev / prod databases
  • mysteries to solve
    • where does dplyr disconnect pooled connections?
    • strategies for a multi-user app?
  • R fundamentals I still need
    • difference between filter() and subset()?
    • ggplot2
    • faceting: facet_wrap() / facet_grid()
  • Some helpful R debugging links:
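On the .Renviron item in the list above: a minimal sketch of switching between dev and prod settings through environment variables. The variable names (`SCORES_ENV`, `SCORES_DB_HOST_DEV`, and so on) and the helper `db_settings()` are hypothetical; in practice the values would sit in a .Renviron file that R reads at startup rather than being set in code.

```r
# Hypothetical .Renviron contents (one NAME=value pair per line):
#   SCORES_ENV=dev
#   SCORES_DB_HOST_DEV=localhost
#   SCORES_DB_HOST_PROD=db.example.org

# Collect the connection settings for the chosen environment.
db_settings <- function(env = Sys.getenv("SCORES_ENV", unset = "dev")) {
  env <- match.arg(env, c("dev", "prod"))   # errors on anything else
  suffix <- toupper(env)
  list(
    env      = env,
    host     = Sys.getenv(paste0("SCORES_DB_HOST_", suffix), unset = "localhost"),
    dbname   = Sys.getenv(paste0("SCORES_DB_NAME_", suffix), unset = "scores"),
    user     = Sys.getenv(paste0("SCORES_DB_USER_", suffix), unset = ""),
    password = Sys.getenv(paste0("SCORES_DB_PASSWORD_", suffix), unset = "")
  )
}

# Simulate what .Renviron would provide, then read it back.
Sys.setenv(SCORES_ENV = "prod", SCORES_DB_HOST_PROD = "db.example.org")
cfg <- db_settings()
cfg$host
```

The returned list can then be handed to whatever opens the connection, so the app code itself never hard-codes a host or password.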

This week so far I have done:

  • modified local db to match v2 specifics
  • built interactive dataframe to postgres db
  • built csv upload function which loads SMHI files successfully
  • finished main score series viewer

This first-pass structure worked for simple data that was all of one datatype…

ERD v0.1

… but in fact we are scoring many variables which need to be tied together more explicitly. Hence, version 2:

ERD v0.2

We need a structure for importing the scores. Currently we receive text files and 3D “cubes” depending on the source… tidying takes time and should be automated so users can load / arrange their own scores (like EVS).
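One way the cube tidying could be automated, sketched in base R: flatten a 3D array into a long table with one row per value, which is the shape a database table wants. The dimension names here (station, lead_time, score) and the numbers are invented for illustration; the real cubes may be laid out differently.

```r
# A made-up 2 x 3 x 4 "cube": stations x lead times x score types.
cube <- array(
  seq_len(24) / 10,
  dim = c(2, 3, 4),
  dimnames = list(
    station   = c("A", "B"),
    lead_time = c("1", "2", "3"),
    score     = c("CRPS", "bias", "RMSE", "NSE")
  )
)

# as.table() keeps the dimnames; as.data.frame() then gives one row per cell.
long <- as.data.frame(as.table(cube),
                      responseName = "value",
                      stringsAsFactors = FALSE)
long$lead_time <- as.integer(long$lead_time)

head(long)
```

Each row now carries its own station, lead time and score type, so rows from text files and from cubes can land in the same table.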

Working with “reactive” call today:

Something to look into on my own time – confidence intervals discussed in a different context: … with a nifty Shiny app to illustrate Figs 1 - 5 from the article:
