Alternate Data Storage Technologies

When the Relational Hammer Whacks Your Thumb

Agenda
Justification
Cast of Characters
Data Storage Decision Factors
Relational Databases
Object Databases
Object-Relational Mapping
DBM Databases
In-Memory Databases
Images
Flat Files
ISAM
Resources

This talk was presented to the NYC CTO Club on April 14th, 2004.

Update: Database War Stories is an excellent series of articles describing real-life problems and solutions. The link is to the last article, which contains links to the other articles in the series.

Agenda

Explore non-relational data storage
Discuss advantages and disadvantages
Hear stories
Collect questions; report back to group

These online notes will be updated based upon feedback received during the presentation.

Justification

"As far as I've seen, once those XML worms eat into your brain, it's hard to ever get anything practical done again. To an XML person, every nail looks like a thumb. Or something like that."
– fejj on advogato.org

Know your options and your tools.

Update: Astute reader Curt Sampson points out, "[T]he relational model is in no way a data storage technique, but is a data modelling technique. It's entirely orthogonal to something such as in-memory storage versus disk storage, for both relational and non-relational models, you have the choice of in-memory or not-in-memory storage."

Curt is, of course, entirely correct. I mean the term "data storage" not in its technical sense but in the sense of a broad approach to modeling, storing, and retrieving data. The talk was given to a room full of CTOs and senior-level technology managers, not database architects. I opened with an informal introduction that hopefully explained what I meant.

Cast of Characters

Data Storage Technologies

Relational databases
Object databases
Object-Relational mapping
DBM databases
In-memory databases
Images
Flat files
ISAM
xBase
Gerbils with crayons

Data Storage Decision Factors

Reasons for storing data

Short-term work: sessions, state management, transactions
Long-term persistence
After-the-fact analysis
Record keeping
Rules-based behavior; flexible apps
Searching
Exchange with other people and processes

Each reason has its own requirements, be it speed, flexibility, or reliability.

Technology Factors

What to keep in mind when selecting a data storage technology. What is good enough?

Speed, reading and writing
Longevity of data
Reliability
- Is it OK to drop writes on the floor?
- Is failover necessary, or even possible?
Querying
- Ad-hoc
- Stable, known in advance
- Relational (set theoretical)
Reporting
- Ad-hoc
- Stable, known in advance
Do you need two data stores?
- Transactions
- Reporting
Do you perform ad-hoc queries on both types of data?
Accessibility by applications, external and internal systems
Maintainability
- Staffing
- Could technology or company become defunct?
ACID (Atomicity, Consistency, Isolation, Durability)
Domain model and its mapping to
- Data storage format (hierarchical, key/value, relational)
- Data access method (tree walk, key lookup, SQL query)
Is domain model more important, or raw data, or are they equals?
Platform availability (language, OS, hardware)
Data store system availability (hand-held device)
Where does business logic belong?

Relational Databases

Based on set theory

SQL

General purpose
Mostly harmless...er...portable
Good for ad-hoc queries
Not a complete programming language

Binary data handling awkward

Library/language issue
Marry database with file system

Relational Online Analytical Processing (ROLAP) vs. Multi-dimensional Online Analytical Processing(MOLAP)

Variations on a theme
- InterSystems Caché "post-relational" database (really an object database with SQL and multidimensional data access)

When to use

Ad hoc queries important
Data fits relational model
You already have one

When not to use

Raw speed is a factor
- Embedded devices
- High-traffic systems (DNS, telecom)
- Becomes necessary to write caching code in your application or tweak the database installation to wring every ounce of performance out of it
Space is a factor
- Embedded devices
- Embedding database in an application
Complex relational queries not necessary
- Web logs
- High scores
- Cookies
Don't mind dropping a transaction or two
Doesn't fit programming model
- Objects and arbitrary graphs don't always map naturally to tuples
- Synchronization between database and objects in memory
  - If you fetch the same database record twice, do you get the same object?
  - If you fetch an object, do the objects it contains get fetched?
- Should your data be in structures, not objects?
You find yourself writing the same query twice:
- Once in SQL to retrieve the data
- Once in your programming language to display it
Granted, the first query may filter the data. If it's a simple filter then is relational really necessary?

Object Databases

Store graph of objects

Handle inheritance and circular references

Generally tied to a particular OO language, though can have adaptors for other languages

Efficiently traverse relationships

Don't need to worry about whether an object is in memory or not

Handling schema changes can be difficult (see discussion in In-Memory Databases below)
Let's Use an Object Oriented Database from the WikiWikiWeb.

A few names

Many OO databases available, both free and proprietary

ObjectStore
Polyhedra
Poet
Gemstone (Smalltalk, Java)
MetaKit (for embedded systems)
GOODS (distributed, language-neutral; clients for C++, Java, Perl, Smalltalk)

When to use

You want to avoid O/R mapping time and effort
You don't want to write the same query twice
You feel the need for speed
It's objects all the way down

When not to use

Ad-hoc queries
- I didn't research OO database reporting tools
- You can always peel data off into a warehouse
- Crystal Reports does have a plug-in for OO databases
Need multi-language access
Schema changes frequently

Stories

All emphasis is mine and is used purely to draw your eye to the interesting bits.

Success

"For the past 5 years or so, the project I work on has been using an Object-Oriented database (Versant, storing Java objects). By all accounts, it has been a success for us. We are now looking at moving to JDO, keeping Versant as a choice, but being able to profile and have other database choices in deployment. JDO is very similar to the proprietary J/VI interface we have been using, but there are some slight differences."

Sotheby's

ODI's ObjectStore was in use at Sotheby's before the auction site was built. The original auction site code used ObjectStore. The numbers showed that reads were much faster than writes, though that was not the major factor in deciding not to use the system developed around it. Another problem was the need to write custom code for every query. Reporting was very difficult; there were no good tools back in '99 (though there may be now). ObjectStore is still in use at Sotheby's for other purposes.

Fast

Dan Woods used ODI at Time, Inc. They had a few problems and had to dump and reload the database. It was blazingly fast.

Fear

From Let's Use an Object Oriented Database:

"Yesterday I was in the audience for a presentation given by a Content Management System vendor. They were very pleased of their model for content, object this and meta that and inheritance the other. And it runs over Oracle. Up goes my paw: why not use an OODBMS? Chuckles all round and there came out the story of another content management outfit who did use an OODBMS, and have nearly gone bust since very few customers were prepared to have the thing in the building. Even for web content, not legacy data, people won't touch an OODBMS. Astounding." – Keith Braithwaite.

"What the hell is everyone so afraid of? GemStone has saved me lots of development work. GemStone has saved me lots of response time. I intend to continue using it for competitive advantage over those who insist on doing it the hard way." – Randy Stafford

Order of magnitude gain

From Let's Use an Object Oriented Database:

"I just wrote a GemStonej application that can switch between object persistence and relational persistence at runtime, based on a property read at startup. At present I'm seeing an order of magnitude difference in response time between the two persistence modes (object persistence being faster, of course) - and this says nothing of the extra development effort you have to go through to do O/R mapping. But in fairness we have more optimization to do on the relational side. And for a simpler domain model than we have, relational persistence may be OK." – Randy Stafford

Experiences

From Let's Use an Object Oriented Database:

"My empirical touchstone here was a comparison we conducted at AC Nielsen in 1996. We tried every OODB on the market then, about a dozen including, I seem to recall, Gemstone. We found similar performance constraints on all. Regular OS paging mechanisms just couldn't keep up with the specialized disk access techniques of the RDBMSes." – Peter Merel

"My experience is that while GemStonej is typically faster than an RDBMS and more flexible, the work required to "roll your own" indexing, searching & querying framework can be complex for some developers not accustomed to dealing with the performance problems of dealing with thousands to millions of objects.

...

"I think this comes to the abstraction principle: an RDBMS is a black box with a big label on it that says: "This is REALITY. :thud of SQL manual falling on desk: You must think inside of this box." An OODBMS says: "Here's a set of minimal rules. Now you can create the rest of your own reality."

...

"My opinion: In application areas where it pays to "think outside of the box", an OODBMS is the proper choice, assuming you work with people that can handle the power properly" – Stu Charlton

"Efficiently traversing relationships: Relational databases don't do this well because it's hard to arrange locality of storage. If I read an employee, then his department, then his office, etc, etc... I'm going to have to return to the database several times. Object databases, on the other hand, let you cluster related data together, enabling you to traverse relationships more efficiently.

"So what? Well, the GemStone application I worked on was a case management system. It's never going to be used to grind over tables and generate reports. The only thing anybody ever does with it is traverse object relationships. I seriously doubt that it would have worked even half as well with a relational database behind it." – Anthony Lander

Object-Relational Mapping

Rationale: much current development is object oriented, much data storage is relational

Impedance mismatch

Object identities are not the same as primary keys
Inheritance hierarchies don't map naturally (especially multiple inheritance, mixins, or interfaces)
Foreign keys can only reference one table; how store a contained object that implements some interface? What if my Container can store Things, where Thing is an interface implemented by Widget, Gadget, and Customer?

It takes time and money to map

Frameworks (language-specific)
- Java: Hibernate, Cayenne, Torque, JDO
- Ruby: Lafcadio, SDS
Hide SQL differences between databases
Some tools support different approaches at the same time:
- Write mapping, generate SQL and code
- Reverse-engineer database, creating mapping and code
- Write code—perhaps with XDoclet hints— and generate mapping and SQL
If language supports reflection or runtime manipulation, framework can hide some of the details that make persistence more transparent
Can take care of a lot of busy work like deleting object trees
Even a home-grown O/R framework can help a team
- Cleans up the code; JDBC code not scattered everywhere
- Uses design patterns
- Enforces single place to look for SQL
Does not always generate the best SQL for a particular query
Can turn relationships expressed awkwardly in RDBMS (many-to-many, for example artists and galleries) into natural OO relationships (each side has list of objects)
Often, mapping is stored as XML or other human-readable format or is even stored in the code itself
Some frameworks let you pre-fetch related objects
ROE: Smalltalk code turned in to SQL

When to use

When you use objects and must store objects in a RDBMS
When you want to save time and effort performing the mapping

When not to use

Need to tweak SQL a lot (not just in one place)
- Frameworks usually let you write raw SQL

Stories

All emphasis is mine and is used purely to draw your eye to the interesting bits.

Caching

Daniel Seltzer worked on a high-load Oracle installation where they could not interject an O/R mapping. He introduced the util.concurrent classes by Doug Lee. The cache manager uses a lookup key for each object. One cache per JVM. He had tools to monitor the cache's performance.

Objects expired by time. This was simpler than anything else. They slightly randomized the time so that everything didn't expire at the same time.

Cayenne

I chose Cayenne for the Preclick PhotoWiki, replacing a custom O/R mapping layer. This cleaned up the code and made it much simpler to modify. It also removed lots of code from the application that worried about the persistence state of objects.

Cayenne performs object caching at three levels (all optional): between data contexts (essentially database connections), within a single JVM, and between JVMs.

Cayenne couldn't handle a few tricky relationships. I found a few bugs in Cayenne having to do with foreign keys that were almost immediately fixed by the developer. The community was quite helpful, even if the answer was, "We can't do that yet. Here's a quick workaround."

Cayenne comes with a good GUI design tool that lets you reverse engineer a database, create one from scratch, generate the Java code, generate the SQL for the database, and generate the database itself.

There are still some meaningless coding and design requirements (from object's point of view), like having to specify reverse relationships.

Cayenne is also used on the NHL's site.

DBM Databases

Key/value pairs (hash tables, dictionaries)

Long history in Unix

Simple but effective API:

open()
close()
fetch(key)
delete(key)
store(key, value)
first_key()
next_key()

Tons of implementations and bindings for different languages

Sleepycat Berkeley DB
- Adds ACI (no D) and commit no-sync
- Adds real-time replication
  - Implementor chooses medium for transport of replicated data
  - Single master, multi slave
  - Can choose how well replicated the data is (can use multicast or TCP/IP)
- Automatic failover via election; algorithm guarantees get best new replica
- Have control over everything, but intelligent defaults for everything
- Many interfaces: C, Java (new API and new Java Edition), XML/XPath, and more
- New Java Edition (in beta)
  - BIND API (available both in JE and Java interface to C version of DB) handles conversion between objects and byte arrays
  - Fully ACID
- In use everywhere (see stories below)
New Berkeley DB Java Edition, a pure-Java implementation of Berkeley DB; see Building Applications with Berkeley DB Java Edition
Implementations typically fast and small (Berkeley DB compiles down to 350K, Java Edition to 418K)
Querying typically difficult; need to create a query layer and a bunch of secondary databases (different indexes (views) on the same data)

When to use

Speed and size are issues
Good for embedding into applications
You know what questions you'll be asking of the data

When not to use

Ad-hoc queries

Stories

Berkeley DB is in use everywhere. Sleepycat believes that 90% of people in the U.S. touch Berkeley DB every day and don't know it. Berkeley DB is used in Sendmail, Apache Web server, Jabber, Sun (LDAP directory servers), EMC (Celerra and Centera products), Electronic Clearing House, Inc., Netscape browser bookmark management, Mozilla news client, Google accounts, the largest U.S. stock market, AOL client software, RSA's Keon Secure VPN, a Sony set-top box, cable TV, Amazon.com (as a cache to speed retrieval of data), telecom industry, DNS.

I feel like a kid in one of those '50s science films: "You see, Timmy, you use plastic every day! Your dog uses plastic! It's here, in the rims of your thick black glasses!"

In-Memory Databases

Historically, memory was scarce

Cache some data in memory; why not all?

Current buzzword: "prevalence"

Not new

Many relational databases pull tables into memory

Prevayler getting the most press; we'll concentrate on that (and ignore the project owners' haughty attitude)
Other examples: MetaKit, Polyhedra, TimesTen, which has been used by a few of our members

Properties

All objects in memory all the time
All (Prevayler) operations implemented as serializable command
Unlike OO and relational databases, prevalence layer doesn't know about internals of system (classes, tables, objects)
Fast: Prevayler claims queries are 3,000 - 9,000 faster than MySQL/Oracle, even when fully cached in RAM and on the same machine
- No SQL to build
- No SQL to run
- No translation between database data and object
Snapshots written to disk (don't need to halt system)
- SnapshotPrevayler handles startup/shutdown housekeeping
- Snapshot starts a new transaction log
Sharing between servers happens via shared command log
- One machine is the hot system
Fault tolerant: replicas can take over
Similar to journaling file system
Implementations are language-dependent
- Java (Prevayler), Smalltalk, Ruby, Python, C#, more
Query via code, X Path, OGNL
Though commands (transactions) are executed sequentially, need to worry about concurrency between commands and queries.
See also
- Images a la Smalltalk and APL
- MySQL in-memory tables
- JavaSpaces
- Newton's soup
Problems scaling the database between machines
Load balancing is up to you

When to use

Data fits in memory
You need the speed
Object model is fairly stable

When not to use

Data won't fit in memory
Ad-hoc queries required
Object model is evolving
- Some changes need translation code
- Can load old classes using one class loader, then save using another
- Can use the Seralization readObject method
- Can export/import XML
- There are tools being written to handle schema/version migration
Need to share data between languages (language-specific)
Need to share data between data stores (can always write code to save into XML or even SQL, of course)

Stories

All emphasis is mine and is used purely to draw your eye to the interesting bits.

Who uses it?

It took me a while to find links to users on the Prevayler site to Prevayler pioneers and Major companies using Prevayler. The latter list is short: Lehman Brothers, Microsoft (Ensemble Studios game company), Blizzard Entertainment, Petrobrás (Brazil's largest oil company), Vivo (large Brazilian cell phone operator), and TV Record (third largest TV network in Brazil).

Too good to be true?

Jon Tirsen wrote: "The morning after (I stumbled in late and hungover actually) my pair-programming mate had read all about it at the Prevayler site I left open in the browser and already ript[sic] out our then OJB-persistence layer and replaced it with Prevayler. Later that day we experienced the sensational 10k performance gain and we're probably never going back again. (Remember our system was very performant[sic] from the start.)"

Images

Canonical example: Smalltalk
Form of in-memory database
Images (snapshots) include everything: objects, code, data
Can share data between applications (trivially in the same image)
Can change classes and instances
Replication difficult
There are classes for relational database I/O; can always use file system
Object serialization
Language dependent
How load balance?

When to use

Want a flexible, dynamic environment

When not to use

Need to play nice with others outside the image
- RPC, SOAP, sockets to the rescue
- Smalltalk is good enough for Wall Street

Stories

Smalltalk has been in use in both research and corporate environments for decades. The Smalltalk community always pointed towards Wall Street when asked where it was used. XP was invented for a Smalltalk project.

Note: need to beef up this section.

Flat Files

File system

Of course, most databases use file system

Tools, tools, tools!
- Text indexing and searching (for example, Lucene and Webglimpse/Glimpse)
File-level locking easy; record-level harder
Record structure
- Line-level: fixed fields, tab- or comma-delimited
- File-level: XML, YAML, free-form text
- Intra-file level: ISAM (data files and index files)
Random querying depend upon data structure and possible pre-indexing
XML
- A return to the hierarchical data model; classic example is IBM's IMS (Information Management System)
- Random querying can be difficult if using XPath or DOM
- XML-based database systems; see XML Database Products; includes lists of XML databases, Query Engines and more.
  - Apache Xindice
- Some RDBMS and OODBs have XML I/O layers
- XQuery XML Query language; see What is XQuery? and A quick intro to XQuery for introductions
YAML (YAML Ain't Markup Language); text-based structured data storage
```
---
name: Dirk Hockeybranch
date: 2004-04-14
minutes spent:
 - 1.02
 - 1.34
 - 0.7
 - 0.89
 - 0.94
```
- numbers, strings, dates, text, arrays, hash maps
- Entity-like references avoid repetition
- Can add MIME-like data types to central registry
- Human readable, machine parseable
- Convenient for "just do it" approaches with simple data
- Easy to implement
- Implementations for Perl, Ruby, Python, PHP, OCaml, JavaScript; Java coming
Inverted files
- File organized into records of a single field
- Entire file is both data and index
- Selection string creates a boolean used to select other fields from the file
- Will handle "ridiculously huge" amounts of data with "miraculous performance"; very efficient for static data
- Significantly better performance than multi-tier app with relational database
- Weaknesses
  - Adding or deleting a single record requires a complete file reorganization
  - Not relational
  - Not mainstream or a vendor-supported product
  - All data must be perfectly de-normalized for best performance
- Inverted files date back to the 1970's

When to use

You want data portability
You need speed
The Pragmatic Programmers say in their wonderful book The Pragmatic Programmer, "20. Keep Knowledge in Plain Text. Plain text won't become obsolete. It helps leverage your work and simplifies debugging and testing." Of course, they don't mean all data should be stored in plain text.

When not to use

When mixing binary and text data in the same record
When you need record-level locking

Stories

Speedier Terabytes

"One of my last employers, XXXX, may have a story for you. Their background – a database marketing agency who provides direct mail and email leads to catalogers, telemarketers, mailers, or any marketing organization... Their database was huge – several terabytes. It was comprised of Experian's Gold File and several other consumer files, equally[sic] two hundred million consumer records.

"Originally the data was stored in relational tables. However, generating counts was a lengthy process in terms of time and computing resources. A decision was made to transform the database into one huge flat file (yep, start gasping for air) and index it using a product called Omnidex. Then a custom app was developed to generated counts and voila, we had a faster, more reliable and cheaper to operate database.

"I had come into the scene after the conversion had occurred. Being used to large scale ERP and DW implementations, I flipped when I heard what was happening. The CIO was able to convince me on the performance gains they made under the arrangement. It works. Furthermore, there was no negative impact on other data processing, such as ETL (we used Ab Initio)."

Ad Campaign Data

Steve Jacobs wrote: "At Poindexter we have built our ad server without a traditional database. We were a small scrappy company for a while with limited resources and the cost of scaling a database infrastructure to support billions of ads served per month was prohibitive. There are 3 areas where a database is needed: data on what ads are in an advertiser's campaign, data we've gathered about the user (what ads they've seen, for example) and reporting data about what ads have been shown.

"For campaign data we take snapshots of the Oracle database that our web based campaign management tool connects to and push the most recent changes to the ad servers. This used to be in the form of XSL documents that encoded both the data and the targeting required for the campaign. The user data was a XML document stored in the users cookie. Serving an ad was simply by performing a transformation on the XML cookie. When an ad was chosen, the reporting information was written out to flat files. These flat files would then be loaded into an Oracle database on a nightly basis for reporting purposes.

"Eventually the cost of doing the transformation with the style sheets became prohibitive – we just couldn't get transactions per second we needed with XML/XSL. We've since migrated to actual Java code generation to replace the XSL where the Java code reads the users cookie. The cookie itself is no longer in XML either; we've switched to a proprietary format that is faster to parse for use by the Java code. When I first suggested to my architect that we generate Java code rather than XSL I prefaced it by saying, 'I can't believe I'm about to suggest this, but...' Nonetheless, it's worked out well, giving us a 3x performance improvement overall.

"I don't like the fact that we've mixed data and code together in one big stew. The benefit is that there's no parsing XML trees or the like – the data is all there built in to the code. The drawback is that changes to the code may end up meaning we have to regenerate the code/data for all campaigns."

Because It's There

Daniel Seltzer was on a project at a non-profit organization a few years ago. They lacked the development process, system admin, and money for traditional IT services. They ended up using the file system as the data store. Daniel developed a publishing tool in Ant. There was a small amount of metadata in a database. An Ant task periodically synched the database with the file system.

They indexed the text with Lucene.

Sync Sort

Mark Uhrmacher has used SyncSort (I'm checking the URL) on flat files with billions of transactions.

ISAM

Indexed Sequential Access Method; from IBM
Data stored sequentially; separate index eases random access
Multiple indices for multiple orderings.
Orthogonal to relational storage; relational can be implemented using ISAM

The xBASE family of databases are structured similar to ISAM, with data files and index files. Applications read the files directly.

Stories

"When adding records with ascending adjacent keys, the add process would get slower and slower because overflow records were kept in an unblocked linked list and the next record had to be added to the end of the list.

"As a system programmer, I was called about twice a year because 'the computer is running slow'. The fix is to add the records in descending key sequence so each record is added at the head of the linked list. It was fun at first being hailed as a genius. But as Yogi Berra said, it was deja vu all over again."

Resources

This list of resources is in addition to those presented inline. I need to flesh out this list.

The Portland Pattern Repository and WikiWikiWeb top level pages are deceptively sparse. Browse the main Category page. Search for programming terms and you'll come across an amazingly deep ocean of discussion. For example, search for page titles including "relational" (49 pages) or "persist" (28 pages)
Christopher Browne's Web Pages. Section VIII, Business Oriented Computing, contains a number of good pages about various database types and implementations.