Justification
Cast of Characters
Data Storage Decision Factors
Relational Databases
Object Databases
Object-Relational Mapping
DBM Databases
In-Memory Databases
Images
Flat Files
ISAM
Resources
This talk was presented to the NYC CTO Club on April 14th, 2004.
Update: Database War Stories is an excellent series of articles describing real-life problems and solutions. The link is to the last article, which contains links to the other articles in the series.
Agenda
- Explore non-relational data storage
- Discuss advantages and disadvantages
- Hear stories
- Collect questions; report back to group
These online notes will be updated based upon feedback received during the presentation.
Justification
"As far as I've seen, once those XML worms eat into your brain, it's hard to ever get anything practical done again. To an XML person, every nail looks like a thumb. Or something like that."
– fejj on advogato.org
Know your options and your tools.
Update: Astute reader Curt Sampson points out, "[T]he relational model is in no way a data storage technique, but is a data modelling technique. It's entirely orthogonal to something such as in-memory storage versus disk storage, for both relational and non-relational models, you have the choice of in-memory or not-in-memory storage."
Curt is, of course, entirely correct. I mean the term "data storage" not in its technical sense but in the sense of a broad approach to modeling, storing, and retrieving data. The talk was given to a room full of CTOs and senior-level technology managers, not database architects. I opened with an informal introduction that hopefully explained what I meant.
Cast of Characters
- Data Storage Technologies
- Relational databases
- Object databases
- Object-Relational mapping
- DBM databases
- In-memory databases
- Images
- Flat files
- ISAM
- xBase
- Gerbils with crayons
- Me
Data Storage Decision Factors
Reasons for storing data
- Short-term work: sessions, state management, transactions
- Long-term persistence
- After-the-fact analysis
- Record keeping
- Rules-based behavior; flexible apps
- Searching
- Exchange with other people and processes
Each reason has its own requirements, be it speed, flexibility, or reliability.
Technology Factors
What to keep in mind when selecting a data storage technology. What is good enough?
- Speed, reading and writing
- Longevity of data
- Reliability
- Is it OK to drop writes on the floor?
- Is failover necessary, or even possible?
- Querying
- Ad-hoc
- Stable, known in advance
- Relational (set theoretical)
- Reporting
- Ad-hoc
- Stable, known in advance
- Do you need two data stores?
- Transactions
- Reporting
- Accessibility by applications, external and internal systems
- Maintainability
- Staffing
- Could technology or company become defunct?
- ACID (Atomicity, Consistency, Isolation, Durability)
- Domain model and its mapping to
- Data storage format (hierarchical, key/value, relational)
- Data access method (tree walk, key lookup, SQL query)
- Platform availability (language, OS, hardware)
- Data store system availability (hand-held device)
- Where does business logic belong?
Relational Databases
- Based on set theory
- SQL
- General purpose
- Mostly harmless...er...portable
- Good for ad-hoc queries
- Not a complete programming language
- Binary data handling awkward
- Library/language issue
- Marry database with file system
- Relational Online Analytical Processing (ROLAP) vs. Multi-dimensional Online Analytical Processing(MOLAP)
- Variations on a theme
- Normalized vs. multi-dimensional denormalized
- InterSystems Caché "post-relational" database (really an object database with SQL and multidimensional data access)
When to use
- Ad hoc queries important
- Data fits relational model
- You already have one
When not to use
- Raw speed is a factor
- Embedded devices
- High-traffic systems (DNS, telecom)
- Becomes necessary to write caching code in your application or tweak the database installation to wring every ounce of performance out of it
- Space is a factor
- Embedded devices
- Embedding database in an application
- Complex relational queries not necessary
- Web logs
- High scores
- Cookies
- Don't mind dropping a transaction or two
- Doesn't fit programming model
- Objects and arbitrary graphs don't always map naturally to tuples
- Synchronization between database and objects in memory
- If you fetch the same database record twice, do you get the same object?
- If you fetch an object, do the objects it contains get fetched?
- Should your data be in structures, not objects?
- You find yourself writing the same query twice:
- Once in SQL to retrieve the data
- Once in your programming language to display it
Object Databases
- Store graph of objects
- Handle inheritance and circular references
- Generally tied to a particular OO language, though can have adaptors for other languages
- Efficiently traverse relationships
- Don't need to worry about whether an object is in memory or not
- Handling schema changes can be difficult (see discussion in In-Memory Databases below)
- Let's Use an Object Oriented Database from the WikiWikiWeb.
A few names
Many OO databases available, both free and proprietary
- ObjectStore
- Polyhedra
- Poet
- Gemstone (Smalltalk, Java)
- MetaKit (for embedded systems)
- GOODS (distributed, language-neutral; clients for C++, Java, Perl, Smalltalk)
When to use
- You want to avoid O/R mapping time and effort
- You don't want to write the same query twice
- You feel the need for speed
- It's objects all the way down
When not to use
- Ad-hoc queries
- I didn't research OO database reporting tools
- You can always peel data off into a warehouse
- Crystal Reports does have a plug-in for OO databases
- Need multi-language access
- Schema changes frequently
Stories
All emphasis is mine and is used purely to draw your eye to the interesting bits.
Success
"For the past 5 years or so, the project I work on has been using an Object-Oriented database (Versant, storing Java objects). By all accounts, it has been a success for us. We are now looking at moving to JDO, keeping Versant as a choice, but being able to profile and have other database choices in deployment. JDO is very similar to the proprietary J/VI interface we have been using, but there are some slight differences."
Sotheby's
ODI's ObjectStore was in use at Sotheby's before the auction site was built. The original auction site code used ObjectStore. The numbers showed that reads were much faster than writes, though that was not the major factor in deciding not to use the system developed around it. Another problem was the need to write custom code for every query. Reporting was very difficult; there were no good tools back in '99 (though there may be now). ObjectStore is still in use at Sotheby's for other purposes.
Fast
Dan Woods used ODI at Time, Inc. They had a few problems and had to dump and reload the database. It was blazingly fast.
Fear
From Let's Use an Object Oriented Database:
"Yesterday I was in the audience for a presentation given by a Content Management System vendor. They were very pleased of their model for content, object this and meta that and inheritance the other. And it runs over Oracle. Up goes my paw: why not use an OODBMS? Chuckles all round and there came out the story of another content management outfit who did use an OODBMS, and have nearly gone bust since very few customers were prepared to have the thing in the building. Even for web content, not legacy data, people won't touch an OODBMS. Astounding." – Keith Braithwaite.
"What the hell is everyone so afraid of? GemStone has saved me lots of development work. GemStone has saved me lots of response time. I intend to continue using it for competitive advantage over those who insist on doing it the hard way." – Randy Stafford
Order of magnitude gain
From Let's Use an Object Oriented Database:
"I just wrote a GemStonej application that can switch between object persistence and relational persistence at runtime, based on a property read at startup. At present I'm seeing an order of magnitude difference in response time between the two persistence modes (object persistence being faster, of course) - and this says nothing of the extra development effort you have to go through to do O/R mapping. But in fairness we have more optimization to do on the relational side. And for a simpler domain model than we have, relational persistence may be OK." – Randy Stafford
Experiences
From Let's Use an Object Oriented Database:
"My empirical touchstone here was a comparison we conducted at AC Nielsen in 1996. We tried every OODB on the market then, about a dozen including, I seem to recall, Gemstone. We found similar performance constraints on all. Regular OS paging mechanisms just couldn't keep up with the specialized disk access techniques of the RDBMSes." – Peter Merel
"My experience is that while GemStonej is typically faster than an RDBMS and more flexible, the work required to "roll your own" indexing, searching & querying framework can be complex for some developers not accustomed to dealing with the performance problems of dealing with thousands to millions of objects.
...
"I think this comes to the abstraction principle: an RDBMS is a black box with a big label on it that says: "This is REALITY. :thud of SQL manual falling on desk: You must think inside of this box." An OODBMS says: "Here's a set of minimal rules. Now you can create the rest of your own reality."
...
"My opinion: In application areas where it pays to "think outside of the box", an OODBMS is the proper choice, assuming you work with people that can handle the power properly" – Stu Charlton
"Efficiently traversing relationships: Relational databases don't do this well because it's hard to arrange locality of storage. If I read an employee, then his department, then his office, etc, etc... I'm going to have to return to the database several times. Object databases, on the other hand, let you cluster related data together, enabling you to traverse relationships more efficiently.
"So what? Well, the GemStone application I worked on was a case management system. It's never going to be used to grind over tables and generate reports. The only thing anybody ever does with it is traverse object relationships. I seriously doubt that it would have worked even half as well with a relational database behind it." – Anthony Lander
Object-Relational Mapping
- Rationale: much current development is object oriented, much data storage is relational
- Impedance mismatch
- Object identities are not the same as primary keys
- Inheritance hierarchies don't map naturally (especially multiple inheritance, mixins, or interfaces)
- Foreign keys can only reference one table; how store a contained object
that implements some interface? What if my
Containercan storeThings, whereThingis an interface implemented byWidget,Gadget, andCustomer?
- It takes time and money to map
- Frameworks (language-specific)
- Hide SQL differences between databases
- Some tools support different approaches at the same time:
- Write mapping, generate SQL and code
- Reverse-engineer database, creating mapping and code
- Write code—perhaps with XDoclet hints— and generate mapping and SQL
- If language supports reflection or runtime manipulation, framework can hide some of the details that make persistence more transparent
- Can take care of a lot of busy work like deleting object trees
- Even a home-grown O/R framework can help a team
- Cleans up the code; JDBC code not scattered everywhere
- Uses design patterns
- Enforces single place to look for SQL
- Does not always generate the best SQL for a particular query
- Can turn relationships expressed awkwardly in RDBMS (many-to-many, for example artists and galleries) into natural OO relationships (each side has list of objects)
- Often, mapping is stored as XML or other human-readable format or is even stored in the code itself
- Some frameworks let you pre-fetch related objects
- ROE: Smalltalk code turned in to SQL
When to use
- When you use objects and must store objects in a RDBMS
- When you want to save time and effort performing the mapping
When not to use
- Need to tweak SQL a lot (not just in one place)
- Frameworks usually let you write raw SQL
Stories
All emphasis is mine and is used purely to draw your eye to the interesting bits.
Caching
Daniel Seltzer worked on a high-load Oracle installation where they could
not interject an O/R mapping. He introduced the util.concurrent
classes by Doug Lee. The cache manager uses a lookup key for each object. One
cache per JVM. He had tools to monitor the cache's performance.
Objects expired by time. This was simpler than anything else. They slightly randomized the time so that everything didn't expire at the same time.
Cayenne
I chose Cayenne for the Preclick PhotoWiki, replacing a custom O/R mapping layer. This cleaned up the code and made it much simpler to modify. It also removed lots of code from the application that worried about the persistence state of objects.
Cayenne performs object caching at three levels (all optional): between data contexts (essentially database connections), within a single JVM, and between JVMs.
Cayenne couldn't handle a few tricky relationships. I found a few bugs in Cayenne having to do with foreign keys that were almost immediately fixed by the developer. The community was quite helpful, even if the answer was, "We can't do that yet. Here's a quick workaround."
Cayenne comes with a good GUI design tool that lets you reverse engineer a database, create one from scratch, generate the Java code, generate the SQL for the database, and generate the database itself.
There are still some meaningless coding and design requirements (from object's point of view), like having to specify reverse relationships.
Cayenne is also used on the NHL's site.
DBM Databases
- Key/value pairs (hash tables, dictionaries)
- Long history in Unix
- Simple but effective API:
open()close()fetch(key)delete(key)store(key, value)first_key()next_key()
- Tons of implementations and bindings for different languages
- Sleepycat Berkeley DB
- Adds ACI (no D) and commit no-sync
- Adds real-time replication
- Implementor chooses medium for transport of replicated data
- Single master, multi slave
- Can choose how well replicated the data is (can use multicast or TCP/IP)
- Automatic failover via election; algorithm guarantees get best new replica
- Have control over everything, but intelligent defaults for everything
- Many interfaces: C, Java (new API and new Java Edition), XML/XPath, and more
- New Java Edition (in beta)
- BIND API (available both in JE and Java interface to C version of DB) handles conversion between objects and byte arrays
- Fully ACID
- In use everywhere (see stories below)
- New Berkeley DB Java Edition, a pure-Java implementation of Berkeley DB; see Building Applications with Berkeley DB Java Edition
- Implementations typically fast and small (Berkeley DB compiles down to 350K, Java Edition to 418K)
- Querying typically difficult; need to create a query layer and a bunch of secondary databases (different indexes (views) on the same data)
When to use
- Speed and size are issues
- Good for embedding into applications
- You know what questions you'll be asking of the data
When not to use
- Ad-hoc queries
Stories
Berkeley DB is in use everywhere. Sleepycat believes that 90% of people in the U.S. touch Berkeley DB every day and don't know it. Berkeley DB is used in Sendmail, Apache Web server, Jabber, Sun (LDAP directory servers), EMC (Celerra and Centera products), Electronic Clearing House, Inc., Netscape browser bookmark management, Mozilla news client, Google accounts, the largest U.S. stock market, AOL client software, RSA's Keon Secure VPN, a Sony set-top box, cable TV, Amazon.com (as a cache to speed retrieval of data), telecom industry, DNS.
I feel like a kid in one of those '50s science films: "You see, Timmy, you use plastic every day! Your dog uses plastic! It's here, in the rims of your thick black glasses!"
In-Memory Databases
- Historically, memory was scarce
- Cache some data in memory; why not all?
- Current buzzword: "prevalence"
- Not new
- Many relational databases pull tables into memory
- Prevayler getting the most press; we'll concentrate on that (and ignore the project owners' haughty attitude)
- Other examples: MetaKit, Polyhedra, TimesTen, which has been used by a few of our members
Properties
- All objects in memory all the time
- All (Prevayler) operations implemented as serializable command
- Unlike OO and relational databases, prevalence layer doesn't know about internals of system (classes, tables, objects)
- Fast: Prevayler claims queries are 3,000 - 9,000 faster than
MySQL/Oracle, even when fully cached in RAM and on the same machine
- No SQL to build
- No SQL to run
- No translation between database data and object
- Snapshots written to disk (don't need to halt system)
SnapshotPrevaylerhandles startup/shutdown housekeeping- Snapshot starts a new transaction log
- Sharing between servers happens via shared command log
- One machine is the hot system
- Fault tolerant: replicas can take over
- Similar to journaling file system
- Implementations are language-dependent
- Java (Prevayler), Smalltalk, Ruby, Python, C#, more
- Query via code, X Path, OGNL
- Though commands (transactions) are executed sequentially, need to worry about concurrency between commands and queries.
- See also
- Images a la Smalltalk and APL
- MySQL in-memory tables
- JavaSpaces
- Newton's soup
- Problems scaling the database between machines
- Load balancing is up to you
When to use
- Data fits in memory
- You need the speed
- Object model is fairly stable
When not to use
- Data won't fit in memory
- Ad-hoc queries required
- Object model is evolving
- Some changes need translation code
- Can load old classes using one class loader, then save using another
- Can use the
Seralization readObjectmethod - Can export/import XML
- There are tools being written to handle schema/version migration
- Need to share data between languages (language-specific)
- Need to share data between data stores (can always write code to save into XML or even SQL, of course)
Stories
All emphasis is mine and is used purely to draw your eye to the interesting bits.
Who uses it?
It took me a while to find links to users on the Prevayler site to Prevayler pioneers and Major companies using Prevayler. The latter list is short: Lehman Brothers, Microsoft (Ensemble Studios game company), Blizzard Entertainment, Petrobrás (Brazil's largest oil company), Vivo (large Brazilian cell phone operator), and TV Record (third largest TV network in Brazil).
Too good to be true?
Jon Tirsen wrote: "The morning after (I stumbled in late and hungover actually) my pair-programming mate had read all about it at the Prevayler site I left open in the browser and already ript[sic] out our then OJB-persistence layer and replaced it with Prevayler. Later that day we experienced the sensational 10k performance gain and we're probably never going back again. (Remember our system was very performant[sic] from the start.)"
Images
- Canonical example: Smalltalk
- Form of in-memory database
- Images (snapshots) include everything: objects, code, data
- Can share data between applications (trivially in the same image)
- Can change classes and instances
- Replication difficult
- There are classes for relational database I/O; can always use file system
- Object serialization
- Language dependent
- How load balance?
When to use
- Want a flexible, dynamic environment
When not to use
- Need to play nice with others outside the image
- RPC, SOAP, sockets to the rescue
- Smalltalk is good enough for Wall Street
Stories
Smalltalk has been in use in both research and corporate environments for decades. The Smalltalk community always pointed towards Wall Street when asked where it was used. XP was invented for a Smalltalk project.
Note: need to beef up this section.
Flat Files
- File system
- Of course, most databases use file system
- Tools, tools, tools!
- grep, cat, Perl
- Text indexing and searching (for example, Lucene and Webglimpse/Glimpse)
- File-level locking easy; record-level harder
- Record structure
- Line-level: fixed fields, tab- or comma-delimited
- File-level: XML, YAML, free-form text
- Intra-file level: ISAM (data files and index files)
- Random querying depend upon data structure and possible pre-indexing
- XML
- A return to the hierarchical data model; classic example is IBM's IMS (Information Management System)
- Random querying can be difficult if using XPath or DOM
- XML-based database systems; see XML Database Products; includes lists of XML databases, Query Engines and more.
- Some RDBMS and OODBs have XML I/O layers
- XQuery XML Query language; see What is XQuery? and A quick intro to XQuery for introductions
- YAML (YAML Ain't Markup Language);
text-based structured data storage
--- name: Dirk Hockeybranch date: 2004-04-14 minutes spent: - 1.02 - 1.34 - 0.7 - 0.89 - 0.94
- numbers, strings, dates, text, arrays, hash maps
- Entity-like references avoid repetition
- Can add MIME-like data types to central registry
- Human readable, machine parseable
- Convenient for "just do it" approaches with simple data
- Easy to implement
- Implementations for Perl, Ruby, Python, PHP, OCaml, JavaScript; Java coming
- Inverted files
- File organized into records of a single field
- Entire file is both data and index
- Selection string creates a boolean used to select other fields from the file
- Will handle "ridiculously huge" amounts of data with "miraculous performance"; very efficient for static data
- Significantly better performance than multi-tier app with relational database
- Weaknesses
- Adding or deleting a single record requires a complete file reorganization
- Not relational
- Not mainstream or a vendor-supported product
- All data must be perfectly de-normalized for best performance
- Inverted files date back to the 1970's
When to use
- You want data portability
- You need speed
- The Pragmatic Programmers say in their wonderful book The Pragmatic Programmer, "20. Keep Knowledge in Plain Text. Plain text won't become obsolete. It helps leverage your work and simplifies debugging and testing." Of course, they don't mean all data should be stored in plain text.
When not to use
- When mixing binary and text data in the same record
- When you need record-level locking
Stories
Speedier Terabytes
"One of my last employers, XXXX, may have a story for you. Their background – a database marketing agency who provides direct mail and email leads to catalogers, telemarketers, mailers, or any marketing organization... Their database was huge – several terabytes. It was comprised of Experian's Gold File and several other consumer files, equally[sic] two hundred million consumer records.
"Originally the data was stored in relational tables. However, generating counts was a lengthy process in terms of time and computing resources. A decision was made to transform the database into one huge flat file (yep, start gasping for air) and index it using a product called Omnidex. Then a custom app was developed to generated counts and voila, we had a faster, more reliable and cheaper to operate database.
"I had come into the scene after the conversion had occurred. Being used to large scale ERP and DW implementations, I flipped when I heard what was happening. The CIO was able to convince me on the performance gains they made under the arrangement. It works. Furthermore, there was no negative impact on other data processing, such as ETL (we used Ab Initio)."
Ad Campaign Data
Steve Jacobs wrote: "At Poindexter we have built our ad server without a traditional database. We were a small scrappy company for a while with limited resources and the cost of scaling a database infrastructure to support billions of ads served per month was prohibitive. There are 3 areas where a database is needed: data on what ads are in an advertiser's campaign, data we've gathered about the user (what ads they've seen, for example) and reporting data about what ads have been shown.
"For campaign data we take snapshots of the Oracle database that our web based campaign management tool connects to and push the most recent changes to the ad servers. This used to be in the form of XSL documents that encoded both the data and the targeting required for the campaign. The user data was a XML document stored in the users cookie. Serving an ad was simply by performing a transformation on the XML cookie. When an ad was chosen, the reporting information was written out to flat files. These flat files would then be loaded into an Oracle database on a nightly basis for reporting purposes.
"Eventually the cost of doing the transformation with the style sheets became prohibitive – we just couldn't get transactions per second we needed with XML/XSL. We've since migrated to actual Java code generation to replace the XSL where the Java code reads the users cookie. The cookie itself is no longer in XML either; we've switched to a proprietary format that is faster to parse for use by the Java code. When I first suggested to my architect that we generate Java code rather than XSL I prefaced it by saying, 'I can't believe I'm about to suggest this, but...' Nonetheless, it's worked out well, giving us a 3x performance improvement overall.
"I don't like the fact that we've mixed data and code together in one big stew. The benefit is that there's no parsing XML trees or the like – the data is all there built in to the code. The drawback is that changes to the code may end up meaning we have to regenerate the code/data for all campaigns."
Because It's There
Daniel Seltzer was on a project at a non-profit organization a few years ago. They lacked the development process, system admin, and money for traditional IT services. They ended up using the file system as the data store. Daniel developed a publishing tool in Ant. There was a small amount of metadata in a database. An Ant task periodically synched the database with the file system.
They indexed the text with Lucene.
Sync Sort
Mark Uhrmacher has used SyncSort (I'm checking the URL) on flat files with billions of transactions.
ISAM
- Indexed Sequential Access Method; from IBM
- Data stored sequentially; separate index eases random access
- Multiple indices for multiple orderings.
- Orthogonal to relational storage; relational can be implemented using ISAM
The xBASE family of databases are structured similar to ISAM, with data files and index files. Applications read the files directly.
Stories
"When adding records with ascending adjacent keys, the add process would get slower and slower because overflow records were kept in an unblocked linked list and the next record had to be added to the end of the list.
"As a system programmer, I was called about twice a year because 'the computer is running slow'. The fix is to add the records in descending key sequence so each record is added at the head of the linked list. It was fun at first being hailed as a genius. But as Yogi Berra said, it was deja vu all over again."
Resources
This list of resources is in addition to those presented inline. I need to flesh out this list.
- The Portland Pattern Repository and WikiWikiWeb top level pages are deceptively sparse. Browse the main Category page. Search for programming terms and you'll come across an amazingly deep ocean of discussion. For example, search for page titles including "relational" (49 pages) or "persist" (28 pages)
- Christopher Browne's Web Pages. Section VIII, Business Oriented Computing, contains a number of good pages about various database types and implementations.