Sunday, March 22, 2015

typesafe slick and config for data processing applications

I use Slick 3.0.0 in my applications. A big improvement in 3.0 is the introduction of database actions, which decouple building a "query" from running it and make it easier to choose between async and sync patterns. It also includes a streaming interface that bounds memory usage.
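As a rough illustration of that decoupling, here is a minimal sketch. It assumes Slick 3.0.0 and the H2 driver are on the classpath; the `Notes` table, the in-memory URL and the `ActionSketch` object are hypothetical names made up for this example and are not part of my application.

```scala
import scala.concurrent.Await
import scala.concurrent.duration._
import scala.concurrent.ExecutionContext.Implicits.global
import slick.driver.H2Driver.api._

object ActionSketch {
  // Hypothetical two-column table, used only for this illustration.
  class Notes(tag: Tag) extends Table[(Int, String)](tag, "NOTES") {
    def id = column[Int]("ID", O.PrimaryKey)
    def text = column[String]("TEXT")
    def * = (id, text)
  }
  val notes = TableQuery[Notes]

  // Building actions touches no database; these are only descriptions of work.
  val setup = DBIO.seq(
    notes.schema.create,
    notes ++= Seq((1, "first"), (2, "second"))
  )
  val readAll = notes.sortBy(_.id).result

  def run(): (Seq[(Int, String)], Int) = {
    // DB_CLOSE_DELAY=-1 keeps the in-memory database alive between connections.
    val db = Database.forURL("jdbc:h2:mem:sketch;DB_CLOSE_DELAY=-1", driver = "org.h2.Driver")
    try {
      // db.run returns a Future; blocking with Await is a choice made by the caller.
      val rows = Await.result(db.run(setup andThen readAll), 10.seconds)

      // db.stream hands rows to the callback one at a time instead of building a Seq.
      var streamed = 0
      Await.result(db.stream(readAll).foreach(_ => streamed += 1), 10.seconds)
      (rows, streamed)
    } finally db.close()
  }
}
```

The same `readAll` value is run once as a plain action and once as a stream, which is the point: the action is defined once and the caller decides how to execute it.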
My applications are often smaller than a large enterprise application. I typically load data into a database and then pull it back out for analysis. Slick makes this easy. Here's how I typically set up my application.
I create layers for the database module. The layers include:
  • a basic config for the profile and database
  • a schema layer
  • a queries layer
  • other layers specific to the application's design or architecture, e.g. integrating db calls into a stream-based processing library
Here's what they look like for a hypothetical application that needs to work with application "events":
import slick.backend.DatabaseConfig
import slick.driver.JdbcProfile

trait HasDBConfig[P <: JdbcProfile] {
  val config: DatabaseConfig[P]
}

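import java.time.LocalDateTime

// Not shown in the original post; shape inferred from the AppEvents projection below.
case class AppEvent(id: Option[Long], eventId: Int, message: Option[String])

trait AppSchemas[P <: JdbcProfile] {
  self: HasDBConfig[P] =>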

  def entityName(name: String) = name.toUpperCase()
  type ID = Long
  import config.driver.api._

  implicit val MyTimestampTypeMapper =
    MappedColumnType.base[LocalDateTime, java.sql.Timestamp](
      java.sql.Timestamp.valueOf(_),
      _.toLocalDateTime)

  class AppEvents(tag: Tag) extends Table[AppEvent](tag, entityName("AppEvents")) {
    def id = column[ID]("ID", O.AutoInc, O.PrimaryKey)
    def eventId = column[Int]("EVENTID")
    def message = column[Option[String]]("MESSAGE")

    def * = (id.?, eventId, message) <> (AppEvent.tupled, AppEvent.unapply)
  }
}

trait LogQueries[P <: JdbcProfile] {
  self: AppSchemas[P] with HasDBConfig[P] =>

  import slick.dbio.DBIO
  import config.driver.api._
  import config.driver.DriverAction

  lazy val Events = TableQuery[AppEvents]

  def delete(events: Seq[AppEvent]) =
    Events.filter(e => e.id inSet events.flatMap(_.id).toSet).delete

  // ....more queries here....
}

case class DataModule[P <: JdbcProfile](config: DatabaseConfig[P]) extends AppSchemas[P] with LogQueries[P] with HasDBConfig[P]
I use a case class for DataModule to make instantiation easy and to make sure that any constructor arguments the traits need are available as soon as the traits initialize. I used self types above, but plain inheritance works too.
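To show why the constructor argument matters, and what the self-type versus inheritance choice looks like, here is a small sketch that uses only the Scala standard library. `HasLabel`, `Banner` and the other names are hypothetical stand-ins for `HasDBConfig` and the layer traits, with a `String` in place of the `DatabaseConfig`.

```scala
// Hypothetical stand-in for HasDBConfig; a String replaces DatabaseConfig.
trait HasLabel { val label: String }

// Self type: Banner requires a HasLabel to be mixed in but does not extend it.
trait Banner { self: HasLabel =>
  // Evaluated while the trait initializes, so `label` must already be set.
  val banner: String = "[" + label + "]"
}

// Inheritance variant of the same layer.
trait InheritedBanner extends HasLabel {
  val banner: String = "[" + label + "]"
}

// Constructor parameters are assigned before the trait bodies run.
case class ViaConstructor(label: String) extends HasLabel with Banner
case class ViaInheritance(label: String) extends InheritedBanner

// A val defined in the class body is assigned after the trait bodies run,
// so Banner reads it while it is still null.
class ViaBody extends HasLabel with Banner { val label = "db" }
```

With these definitions, `ViaConstructor("db").banner` and `ViaInheritance("db").banner` both see the label, while `new ViaBody().banner` is built from a null. That is the initialization-order trap that passing the config as a constructor argument avoids.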
My apps have a fixed number of databases they can use. I'll usually define a class that has a val with the proper data module:
import com.typesafe.config.{Config, ConfigFactory}
import slick.backend.DatabaseConfig

case class DatabaseModules(configsource: Config = ConfigFactory.load()) {

  if (!configsource.hasPath("driver"))
    throw new IllegalArgumentException("Invalid configuration. No driver property specified. Use JVM option -Dconfig.trace=loads to view config settings")

  val DatabaseAccess = configsource.getString("driver") match {
    case "slick.driver.H2Driver$" =>
      DataModule(DatabaseConfig.forConfig[slick.driver.H2Driver]("", configsource))
    case "slick.driver.MySQLDriver$" =>
      DataModule(DatabaseConfig.forConfig[slick.driver.MySQLDriver]("", configsource))
    case "slick.driver.PostgresDriver$" =>
      DataModule(DatabaseConfig.forConfig[slick.driver.PostgresDriver]("", configsource))
    case x =>
      throw new IllegalArgumentException(s"Invalid configuration. Unknown driver: $x. Use JVM option -Dconfig.trace=loads to view config settings.")
  }
}
I do not need fancy user messages for the app so I'll just take the exception. I believe an exception is appropriate here because if I returned an Option or Try, the rest of my application would have to test for it. Since I do not want my application to continue if there is a problem with the configuration, this works for me.
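The trade-off can be sketched with the standard library alone. In this illustration a plain `Map` stands in for the Typesafe Config object, and `FailFastSketch`, `driverOrThrow` and `driverTry` are hypothetical names, not code from my application.

```scala
import scala.util.Try

object FailFastSketch {
  // Hypothetical list of supported drivers for this illustration.
  val known = Set("slick.driver.H2Driver$", "slick.driver.PostgresDriver$")

  // Fail-fast: callers either get a driver name or the exception propagates.
  def driverOrThrow(settings: Map[String, String]): String =
    settings.get("driver") match {
      case Some(d) if known(d) => d
      case Some(d) => throw new IllegalArgumentException(s"Unknown driver: $d")
      case None => throw new IllegalArgumentException("No driver property specified")
    }

  // Try-returning variant: the failure is captured as a value,
  // so every caller has to inspect the result before using it.
  def driverTry(settings: Map[String, String]): Try[String] =
    Try(driverOrThrow(settings))
}
```

With `driverOrThrow`, the rest of the program can use the result directly. With `driverTry`, each use site needs a `map`, a `match` or a `get`, which is overhead I do not want when a bad configuration should stop the application anyway.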
The DatabaseModules class takes a config. My config looks like:
driver = "slick.driver.H2Driver$"
db {
  url = "jdbc:h2:tcp://localhost/~/dbs/appeventsdb;MULTI_THREADED=1"
  password = sa
  user = sa
  connectionTimeout=2000
}
In my application, I use the data module like this:
    val config = ConfigFactory.parseFile(Paths.get("appevents.conf").toFile).withFallback(ConfigFactory.load())
    val DataModule = DatabaseModules(config).DatabaseAccess
    val moreDbStuff = SomeDBClassWithStuffInIt(DataModule)

    import DataModule._
    import DataModule.config.driver.api._
    import DataModule.config.db
    ...
    import moreDbStuff._
That's about it. You could easily extend this to support prod/dev/test configurations. Note that there are some issues in Slick 3.0.0 RC1 that require the db sub-object for the moment.
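One possible way to do the prod/dev/test split is sketched below, assuming only the Typesafe Config library. The layout (shared settings at the top level, one section per environment), the `dev` and `test` section names, the URLs and the `forEnv` helper are all hypothetical and chosen for this example.

```scala
import com.typesafe.config.{Config, ConfigFactory}

object EnvConfigSketch {
  // Hypothetical layout: shared settings at the top, per-environment overrides below.
  val source: Config = ConfigFactory.parseString(
    """
      |driver = "slick.driver.H2Driver$"
      |db { user = sa, password = sa, connectionTimeout = 2000 }
      |dev  { db.url = "jdbc:h2:mem:appevents_dev" }
      |test { db.url = "jdbc:h2:mem:appevents_test" }
    """.stripMargin)

  // Settings in the environment section win; anything missing falls back to the shared values.
  def forEnv(env: String): Config =
    if (source.hasPath(env)) source.getConfig(env).withFallback(source)
    else throw new IllegalArgumentException(s"No configuration section for environment: $env")
}
```

The result of `EnvConfigSketch.forEnv("dev")` still has `driver` and the `db` sub-object at the top level, so it could be passed to `DatabaseModules` in place of the config loaded from the file.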
