Tag vite :

About this site (2)

 Posté par greg le 4 juin 2010 22:36 |  Les commentaires sont désactivés pour ce post.
Tags : geek  computing  postgresql  dev  php  symfony 

Everything will be taggable

At first glance, it’s easy. Dealing with tags is just about sticking strings to database records. The first try I have made led to this plugin The idea was to bind mongodb records to Doctrine objects. I began to use it and then realized plain tags are … useless. It ends up with millions tags, deprecated or inaccurate tags with no way to categorize your content efficiently.

The answer to this problem is simple: tags are good but hierachical tags are better ! If you save a picture with the tags «cantal» and «cheese», this picture will be automatically tagged «europe» «france» «cantal» and «food» «dairy product» «cheese» and will appear in the results if you search any of these tags.

The problem was now, how to store such structure and be able to perform searchs in one query. Because if you look for the contents tagged dairy products, you need to first look for this tag and then all the contents which are tagged with it or with one of its children. Tree structure fails at optimizing this solution because the first search is lost somewhere in the tree: exit LDAP or nested tree. I found no way to modelize this with mongoDB either (The use of materialized paths has been explained since then) ... I was in a technological dead end until I found Postgresql’s LTree structure

That was it ! I changed the tags field from varchar into an array of LTree. I did not want to create a n-m relation because tags were going to be URLs.

coolkeums=> d ck_content
                                       Table "public.ck_content"
    Column     |            Type             |                        Modifiers                        
---------------+-----------------------------+---------------------------------------------------------
 id            | integer                     | not null default nextval('ck_content_id_seq'::regclass)
 created_at    | timestamp without time zone | default now()
 title         | character varying(255)      | not null
 tags          | ltree[]                     | not null
 slug          | character varying(255)      | 
 is_visible    | boolean                     | not null default true
 used          | integer                     | default 0
 preview       | character varying           | 
 culture       | character(2)                | default 'fr'::bpchar
 ck_user_login | character varying           | not null
Indexes:
    "ck_content_pkey" PRIMARY KEY, btree (id)
Check constraints:
    "ck_content_slug_check" CHECK (slug::text ~ '[a-z0-9-]+'::text)
    "culture_lowercase" CHECK (culture ~ '[a-z]{2}'::text)
Foreign-key constraints:
    "ck_content_ck_user_login_fkey" FOREIGN KEY (ck_user_login) REFERENCES ck_user(login)

With this structure, if we use the VIEW defined last time, finding contents by tag is as easy as

coolkeums=> SELECT id, type, tags FROM ck_content_v WHERE '*.postgresql.*' ~ ANY (tags);
 id |    type    |                            tags                            
----+------------+------------------------------------------------------------
 34 | ck_link    | {geek.computing.postgresql}
 23 | ck_article | {geek.computing.postgresql,geek.computing.dev.php.symfony}
 37 | ck_article | {geek.computing.postgresql,geek.computing.dev.php.symfony}
(3 rows)

How can I use this with symfony ?

The answer is simple : you can’t. As soon as you want to be database agnostic, there is no way you can support these features. This means neither Doctrine nor Propel will do the job : no support for regular expression operators, arrays and LTree. This is where the story of sfPgLookPlugin starts :

  • raw SQL queries with prepared statements
  • take advantage of inheritance
  • one layer from the database to preserve performances
  • use all types Postgresql can offer you
  • translaters to convert SQL arrays to PHP arrays, boolean etc
  • possibility to have query objects

On the other hand, going with a prototypal ORM also meant:

  • no admin generator
  • manage fixtures and data structure on my own

Database to model inheritance

If my database use inheritance, also my model and form classes should!

class CkContentMap extends PgLookBaseObjectMap
{
  protected function initialize()
  {
    $this->connection   = PgLook::getConnection();
    $this->object_class =  'CkContent';
    $this->object_name  =  'ck_content_v';
    $this->field_definitions  = array(
      'id' => 'PgLookIntType',
      'created_at' =>    'PgLookTimeStampType',
      'ck_user_login' => 'PgLookStrType',
      'title' =>         'PgLookStrType',
      'tags' =>          'PgLookArrayType[PgLookLTreeType]',
      'slug' =>          'PgLookStrType',
      'is_visible' =>    'PgLookBoolType',
      'type' =>          'PgLookStrType',
      'used' =>          'PgLookIntType',
      'culture' =>       'PgLookStrType',
      'preview' =>       'PgLookStrType'
    );
    $this->pk_fields    = array('id');
  }

Here, the classe is mapped to the database _ck_content_v_ VIEW. Records retreived from the database will have the «type» information. This is coherent because a content can not be saved as is, it has to be a link or an article or any of content’s children but it cannot be just a content. Trying to INSERT OR DELETE on this structure will result in an exception. The article‘s definition is now very simple:

class CkArticleMap extends CkContentMap
{
  protected function initialize()
  {
    parent::initialize();
    $this->object_class =  'CkArticle';
    $this->object_name  =  'ck_article';
    $this->addField('content', 'PgLookStrType');
    $this->addField('allow_comments', 'PgLookBoolType');
    unset($this->field_definitions['type']);
  }

The fields declarations are just here for PgLook to know what to INSERT or UPDATE in the database. It’s possible to SELECT extra data and access them as any “regular” field. In the following example, you can can filter articles upon a WHERE condition passed as parameter and the comments count associated to the corresponding article are returned as «comment_count»:

 public function findWhereWithCommentCount(PgLookWhere $where)
  {
    $sql = sprintf("SELECT art.*, count(comm.id) AS comment_count FROM ck_article art LEFT JOIN ck_comment comm ON comm.article_id = art.id WHERE %s GROUP BY %s ORDER BY art.created_at DESC", $where, $this->getGroupByFields('art'));

    return $this->query($sql, $where->getValues());
  }

Like in Doctrine, the getCommentCount() accessor is dynamically created :

<?php if ($article->getAllowComments()): ?>
<a href="#comments"><?php echo format_number_choice('[0]No comments|[1]One comment|(1,+Inf]%count% comments', array('%count%' => $article->getCommentCount()), $article->getCommentCount()) ?></a>
<?php else: ?>
<?php echo __('Comments are deactivated for this post') ?>
<?php endif ?><br />

And of course, here is how to fetch all contents associated to a tag in the CkContent class :

public function findByTag($tag)
  {
    $tag = sprintf('*.%s.*', $tag);

    return $this->findWhere('? ~ ANY(tags) ORDER BY created_at DESC', array($tag));
  }

Liens récents 

 Crew - code review for Git projects.

Marqué par greg le 6 mai 2012 07:57..
Tags: geek computing git

Suivre le lien

 responsive grid system

Marqué par greg le 5 mai 2012 12:38..
Tags: geek computing dev

Suivre le lien

 Vim "improved

Marqué par greg le 8 mars 2012 09:26..
Tags: geek computing vim

Suivre le lien

 Highly scalable NoSql blog

Marqué par greg le 3 mars 2012 09:12..
Tags: geek computing dev

Suivre le lien

 Bruce Momjian Pg blog

Marqué par greg le 16 février 2012 13:51..
Tags: geek computing postgresql net blog

Suivre le lien

 Richard Miller's blog

Marqué par greg le 16 février 2012 09:57..
Tags: geek computing dev php symfony net blog

Suivre le lien

 Online SQL explain interface

Marqué par greg le 15 février 2012 09:58..
Tags: geek computing postgresql net

Suivre le lien

 Ace online code editor

Marqué par greg le 15 février 2012 08:57..
Tags: net geek computing dev javascript

Suivre le lien

 Coding horror

Marqué par greg le 14 février 2012 12:42..
Tags: geek computing dev net blog

Suivre le lien

 telegraphCQ - Stream oriented database

Marqué par greg le 13 février 2012 14:44..
Tags: geek computing postgresql

Suivre le lien