Using Postgresql for Nosql things


(Richard Cunningham) #21

In the Innodb table type, the data is ordered (clustered) by the primary key: https://dev.mysql.com/doc/refman/5.0/en/innodb-index-types.html {user_id, datetime} was the ordering I used and worked for pretty much every query I needed - I had to denormalise so I could do it. With the default MyISAM table type, you can’t set the order on disk.


(Andy Wootton) #22

ISAM? Now we’re talking my language! :slight_smile:

One of the built-in record formats of RMS-11 record management system of the Files-11 filing system of the VAX-11/VMS operating system. And because people seem to like being stuck in a time-loop, some nutter has built a Pi powered PDP-8


Now if someone could only write a portable OS that would run on a PDP-8…


(Marc Cooper) #23

Sorry, I should have said except for the pk. That said, I mostly use uuid pks now. Not pleasant to have to denormalise for your use case. I get very twitchy about denormalisation, because I watched a company go down because of a financial miscalculation it resulted in.


(Andy Wootton) #24

I think @Rythie & @SIL explained it well. Mainstream RDBMS weren’t fast enough for the mostly read-only web and were too hard to drive, so MySQL and Microsft cut corners on the relational model and made life easy. Once people were using bad databases, they saw MySQL/‘SQL Server’ problems as RDBMS problems and needed something faster, so the NoSQL family happened as short term fixes but the things that worked are being absorbed back into Postgres. It’s the standard Unix evolutionary life-cycle. It only goes wrong when there are 2 solutions that draw, like vi and emacs or KDE and Gnome.

I started working on a VAX-11/780 in 1981. Other minis were 16 bit. 32 bits gave 2GB system address space and 2GB user space. I saw that described as “near infinite”. How many years later did the visionary Bill Gates ask why anyone would ever need more than 640K in a PC? I don’t think people have the ability to imagine beyond combining what they have already seen to be possible. There was no point having data beyond 2GB because it would take days to process and the machine would probably crash before that. The VAX replaced a Data General Ecipse, unlovingly known as ‘the Collapse’.


(Richard Cunningham) #25

Actually in the late 90s/early 2000s mainstream RDBMSes were just too expensive for many tasks (and still are), Oracle want such a lot of money, which could have never worked for free sites like Digg, Flickr, Facebook, Delicious and Last.fm in the early days.

I think MySQL was favoured over Postgres early on because much development was done on Windows PCs and Postgres didn’t support that (not added till 2010 I believe).


(Andy Wootton) #26

I’m sure you’re right but the Internet has always run on Unix/Linux so you’re blaming developers for choosing Windows? Fair enough :slight_smile:


(Marc Cooper) #27

Also, MySQL was ubiquitous with cheap web-hosting (along with PHP) in the early days. Everyone used it.

2005 on Windows according to wikipedia.


(Richard Cunningham) #28

Oh yeah 2005, I was looking at the wrong column!


(Jim Gumbley) #29

Fwiw I would shout that I’ve used this approach on numerous project and it has worked extremely well for the developers and we never lost data (unlike in adventures beyond mongoland).

Typically using the relational features of PostgreSQL when the data model requires it and JSON table liberally where a document model is required.

If a column store was what you really needed you should probably look elsewhere but PostgreSQL as a SQL/Document store (noSQL) has fwiw been pretty successful on projects I’ve led.


(Jon) #30

That’s interesting - would you say that there’s no harm in people switching to NoSQL, even when their use-case is clearly relational? I don’t know how non-nested items “join” in NoSQL, but I would have thought that complex queries in SQL is still a reason to stick with the relational model.


(Andy Wootton) #31

I’m pleased that I asked the question. I’ve done a quick search for “nosql” and apart from this thread, it was mostly jobs so it was clearly an up-and-coming technology that has come and is about to go. “it’s good to talk”.


(Richard Cunningham) #32

No, I’m not saying just use NoSQL if it doesn’t fit. If the use-case is clearly a better fit in relational DB, use that. However, be aware in relational, that schema changes need planning, possibly downtime and/or DB server replication with MySQL. As @auxbuss points out Postgres doesn’t have some of schema change issues.

Probably even with NoSQL systems, schema changes should be thought out too, because no one wants to deal with data that’s a mess. Using small bits of JSON in relational DB rows is a solution I’ve used for some data that was quite complicated to model and ever changing in schema.


(Andy Wootton) #33

What I’d taken from this whole conversation was to steal NoSQL ideas to use with Postgres first, for the best of both.

Well there’s a coincidence. See 8.14.2 http://www.postgresql.org/docs/current/static/datatype-json.html

“datums”?!


(Jon) #34

Righto, that makes sense, thanks.

I’ve seen NoSQL systems where schema changes are made, and old data is not cleaned up, creating a “jagged” schema that isn’t justified (does an item not have a property because the property does not apply, or because the property now lives somewhere else and the item was inserted after that schema change? :smile: )

Looks like JSON is coming to MySQL too!


(Andy Wootton) #35

But Oracle own MySQL now, a free tool that competes with their cash cow. ‘Buyer’ beware.


(Richard Cunningham) #36

For the most part people and distros are moving to the MariaDB fork, which ironically, even includes Oracle Linux (because it’s a RedHat clone and they moved).


(Jon) #37

This is true. However, the scare stories that caused the MySQL founder to fork the codebase and create MariaDB do not seem to have materialised, and there is excellent work going into improving MySQL (in Facebook and YouTube IIRC).


(Marc Cooper) #38

Taking it further, postgresql has its hstore type (which is basically a hash/associative array/kv store) whose content you can query against, apply indexes to, and return as JSON. So you have a searchable, schema-less “document store” right there is you want to use it that way. (You just happen to able to use SQL on it if you want to.)

One frequent use case I have is all those User model settings. I usually throw them in an hstore or two. Say, one for email notification, another for “profile”. Since some settings don’t necessarily exist, I handle them in, say, a UserNotification class (usually a delegate of User). Therefore, when someone creates a new notification, no schema change is necessary. It’s just a value that may or may not exist in the hstore (which is just a field of User).

Hope that makes sense.


(Andy Wootton) #39

@Rythie & @halfer 2 more things I’ve learned today. I’ve been nowhere near MySQL since Oracle bought Sun. 3 things if you include that Oracle Linux is still going. Why would Facebook & YouTube cosy up to Oracle rather than the MariaDB community?


(Marc Cooper) #40

postgresql goes further than stealing. It surrounds and consumes! It has Foreign Data Wrappers (FDWs) that allow you to wrap an entire database engine and use it from with in postgresql.

Here’s an article where someone embeds redis in postgresql:
http://www.craigkerstiens.com/2012/10/18/connecting_to_redis_from_postgres/

Here’s the list of available FDWs:
https://wiki.postgresql.org/wiki/Foreign_data_wrappers

Facebook within postgresql? Yes you can!

I am not recommending any of these, btw :wink: