Tag vite :

Your queries will never be the same again

 Posté par greg le 17 février 2012 17:07 |  Les commentaires sont désactivés pour ce post.
Tags : geek  computing  postgresql  dev  php  pomm 

It took me months not to say years to figure out. I was first frustrated because ORMs did not let me use the cool features in Postgresql like types and operators, this is why I made Pomm. But, this morning, I went out from bed and I just discovered that you do not need an ORM with Postgres because it can deals directly with objects. This may be hard to understand so in this article, I’ll try to go trough the process of my understanding.

the good granpa’s sql

In my previous life, SQL was about querying tables, meaning «please postgresql get me all the fields from this table that satisfy this condition;». It was a bit painful to manage relations between tables this way because this led to many queries. If by example, in a blog system, I have an author table with a post table that contains all the blog posts from the authors. There is a 1-n relation between theses two tables and this is 80% of the relations you meet in a relational database. If you want to retrieve all information in one query you had to:

SELECT 
  author.*, 
  post.* 
FROM 
  author 
    CROSS JOIN post ON post.author_id = author.id 
WHERE 
    author.id = 1;

| id |   name   |  id  |       title       |  ...
+----+----------+------+-------------------+-----
|  1 | John Doe |   1  | 'my first post'   |
+----+----------+------+-------------------+-----
|  1 | John Doe |   2  | 'here again'      |
+----+----------+------+-------------------+-----
|  2 | Edgar    |  47  | 'freaking news'   |
+----+----------+------+-------------------+-----
|  2 | Edgar    |  48  | 'once upon a time'|
+----+----------+------+-------------------+-----
|  2 | Edgar    |  49  | 'hu ... wait!'    |

As you know, this leads to a number of lines all containing the same informations about the author repeated the number of times the blog posts this author has. So what did 99.999% of the developers (including me) is: making two queries, one to fetch the author informations and another one for the associated blog posts. ORMs do that transparently using the getters like getBlogPosts() on a author instance and then keep everything in memory.

Let’s use objects and arrays!

Postgresql implements arrays and that’s a big big feature. And because Pg developers are very smart, they even provide us with a function that can aggregate rows in an array. Some of you can maybe understand where I am going right now but the problem is to put all blog posts information in arrays… how can that be ? It is very simple, you can handle table results like objects.

SELECT author FROM author;

|    author      |
+----------------+
| "(1,john doe)" |
+----------------+
| "(2,"Edgar)"   |
+----------------+

In the example below, I ask Postgesql to return me a composite form of the author table, because in Pg, defining a table is also defining a composite type, it is possible to return table results as complex types not to say objects! So, finding a blog author with all its blog posts is as simple as doing

SELECT 
  author.*, 
  array_agg(post) AS posts 
FROM 
  author 
    LEFT JOIN post ON post.author_id = author.id
GROUP BY
  author.id;

| id |   name   |  posts
+----+----------+--------------------------------------------
|  1 | John Doe | {"(1,my first post)","(2,here again)"}
+----+----------+--------------------------------------------
|  2 | Edgard   | {"(47,freaking news)", ... }

Ok, how does it work with Pomm ?

Maybe you hoped that I would tell you it is very difficult and I am very sorry because it is absolutely not the case. Let’s imagine the method in the model that would fetch such a result set:

<?php #model/YourDb/YourSchema/AuthorMap.php

public function findAllWithPosts() 
{
    $post_map = $this->connection->getMapFor('YourDbYourSchemaPost');
    $sql = "SELECT %s, array_agg(post) AS posts FROM %s LEFT JOIN %s ON author.id = p2.author_id GROUP BY author.id ORDER BY author.id ASC";
    $sql = sprintf($sql, 
            join(', ', $this->getSelectFields('author')),
            $this->getTableName('author'),
            $post_map->getTableName('post'));

    $this->addVirtualField('posts', 'Post[]');

    return $this->query($sql);
}

So here you recognize the query we saw previously. This will hydrate a Author instance and we add the extra field posts with the array results. That’s cool but what we would like here is to be able to ask Pomm to transform this mess in an array of Post instances.

To do so, we add our Map file an extra virtual field named posts associated to the Post[] converter. The Post type must be registered to the database associated with the entity converter on Post entities. The ‘[]’ prefix is here to precise this is an array of this type. During the conversion process if there is a result field named ‘posts’ this converter will be used and our Author instance will by provided with an array of Posts under the posts accessor.

For those of you who are interested to play with arrays, you will be glad to hear Pg also proposes an unnest function that breaks arrays into rows making your CTEs more fun. Take care.

Liens récents 

 Crew - code review for Git projects.

Marqué par greg le 6 mai 2012 07:57..
Tags: geek computing git

Suivre le lien

 responsive grid system

Marqué par greg le 5 mai 2012 12:38..
Tags: geek computing dev

Suivre le lien

 Vim "improved

Marqué par greg le 8 mars 2012 09:26..
Tags: geek computing vim

Suivre le lien

 Highly scalable NoSql blog

Marqué par greg le 3 mars 2012 09:12..
Tags: geek computing dev

Suivre le lien

 Bruce Momjian Pg blog

Marqué par greg le 16 février 2012 13:51..
Tags: geek computing postgresql net blog

Suivre le lien

 Richard Miller's blog

Marqué par greg le 16 février 2012 09:57..
Tags: geek computing dev php symfony net blog

Suivre le lien

 Online SQL explain interface

Marqué par greg le 15 février 2012 09:58..
Tags: geek computing postgresql net

Suivre le lien

 Ace online code editor

Marqué par greg le 15 février 2012 08:57..
Tags: net geek computing dev javascript

Suivre le lien

 Coding horror

Marqué par greg le 14 février 2012 12:42..
Tags: geek computing dev net blog

Suivre le lien

 telegraphCQ - Stream oriented database

Marqué par greg le 13 février 2012 14:44..
Tags: geek computing postgresql

Suivre le lien