Monica ([personal profile] cellio) wrote 2014-07-10 07:46 pm

learning new patterns

Coming to the world of SQL databases from the world of object-oriented programming is...different. I'm starting to realize why some idioms are different, and I'm sure there are tons more that I haven't noticed yet and am probably getting wrong. But that's what learning experiences are for.

Consider, for example, a system where you have authors with associated publications. If I were designing a system to track that in, say, Java, I would define an Author class and a Publication class, with bidirectional links (Author would have a collection of Publications; Publication would have a collection of Authors (because sometimes authors collaborate)). But in a database table design you don't do that; you define a Persons table that has columns for some unique ID, name, and anything else about the person, and you have a Publications table that has columns for things about the publication like a (book) unique ID, title, publisher, genre, etc., and also the unique ID from the Persons table for the author -- and I'm not sure if multiple authors means multiple rows in the Publications table or if there's some way to do collections. But the point is that a Person doesn't know about its publications -- when you want that, you do a JOIN between the two tables and then you have what you need. Connections between flavors of data are external to the data. This makes sense, but it's going to take a little getting used to.
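On the multiple-authors question, one common relational answer is a third table that holds one row per (person, publication) pair, so neither the Persons nor the Publications table carries a collection. Here is a minimal sketch using Python's built-in sqlite3 module. The table names, column names, and sample rows (persons, publications, authorship, "Alice", "Bob") are made up for illustration and are not from any real system.

```python
import sqlite3

# In-memory database so the sketch leaves nothing behind.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE persons (
        person_id INTEGER PRIMARY KEY,
        name      TEXT NOT NULL
    );
    CREATE TABLE publications (
        publication_id INTEGER PRIMARY KEY,
        title          TEXT NOT NULL,
        publisher      TEXT,
        genre          TEXT
    );
    -- Hypothetical link table: one row per (author, publication) pair.
    -- Collaboration means several rows here, not several publication rows.
    CREATE TABLE authorship (
        person_id      INTEGER NOT NULL REFERENCES persons(person_id),
        publication_id INTEGER NOT NULL REFERENCES publications(publication_id),
        PRIMARY KEY (person_id, publication_id)
    );
""")

conn.executemany("INSERT INTO persons (person_id, name) VALUES (?, ?)",
                 [(1, "Alice"), (2, "Bob")])
conn.executemany("INSERT INTO publications (publication_id, title) VALUES (?, ?)",
                 [(10, "Solo Work"), (11, "Joint Work")])
conn.executemany("INSERT INTO authorship (person_id, publication_id) VALUES (?, ?)",
                 [(1, 10), (1, 11), (2, 11)])


def titles_by(conn, name):
    """Return the titles a person has (co-)authored, via two JOINs."""
    rows = conn.execute("""
        SELECT pub.title
        FROM persons AS p
        JOIN authorship AS a ON a.person_id = p.person_id
        JOIN publications AS pub ON pub.publication_id = a.publication_id
        WHERE p.name = ?
        ORDER BY pub.title
    """, (name,)).fetchall()
    return [row[0] for row in rows]


def authors_of(conn, title):
    """Return the names of everyone credited on a publication."""
    rows = conn.execute("""
        SELECT p.name
        FROM publications AS pub
        JOIN authorship AS a ON a.publication_id = pub.publication_id
        JOIN persons AS p ON p.person_id = a.person_id
        WHERE pub.title = ?
        ORDER BY p.name
    """, (title,)).fetchall()
    return [row[0] for row in rows]


print(titles_by(conn, "Alice"))
print(authors_of(conn, "Joint Work"))
```

Both directions of the old bidirectional link come out of the same three tables; which direction you get depends only on how the query is written, which is the "connections are external to the data" idea in practice.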

(Y'all who are way ahead of me on this should please feel free to point out any errors in the above and save me mis-learning some things. Thanks.)

[personal profile] dsrtao 2014-07-11 02:08 pm (UTC)
Your schema holds the data you think is important, but in any non-trivial project, this will change repeatedly over the course of the work. An awful lot of people seem to think that it's a good idea to start by defining your schema, and then never think about it again. They are almost certainly wrong. However, if your SQL is tightly linked into your code, it will be difficult to change your schema without changing everything else.

That's why everybody* uses a toolkit. If it's a good toolkit, it does what you want long-term, including schema migrations and reversions and differentiating test environments from production. If it's a bad toolkit, it does what you want short-term but breaks when you outgrow it. And if it's an awful toolkit, it does nothing that a hand-written library of SQL statements with placeholders and bind variables wouldn't do better.
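For a sense of what that baseline looks like, here is a rough sketch of a "library of SQL statements with placeholders and bind variables", written against Python's built-in sqlite3 module. The STATEMENTS dictionary, the run() helper, and the persons table are hypothetical names invented for this example; the point is only that the SQL lives in one place and values are bound by the driver rather than pasted into the query string.

```python
import sqlite3

# Hypothetical statement library: all SQL lives here, keyed by name,
# so a schema change means editing this one dictionary.
STATEMENTS = {
    "create_persons": """
        CREATE TABLE persons (
            person_id INTEGER PRIMARY KEY,
            name      TEXT NOT NULL
        )
    """,
    "add_person": "INSERT INTO persons (name) VALUES (:name)",
    "find_person": "SELECT person_id, name FROM persons WHERE name = :name",
}


def run(conn, statement_name, **params):
    """Execute a named statement, binding params to its :placeholders."""
    cursor = conn.execute(STATEMENTS[statement_name], params)
    return cursor.fetchall()


conn = sqlite3.connect(":memory:")
run(conn, "create_persons")

# The apostrophe is passed as a bound value, so it needs no escaping.
run(conn, "add_person", name="O'Brien")
found = run(conn, "find_person", name="O'Brien")
print(found)
```

Calling code only ever names a statement and supplies values, so it is insulated from the exact SQL text. What this sketch does not give you is anything a good toolkit adds on top, such as migrations or separate test and production configurations.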

https://en.wikipedia.org/wiki/List_of_object-relational_mapping_software is not comprehensive, but is a plausible overview.



*No, not really everybody. But most people on most projects.