POC on Elastic Search – with CouchDB & jQuery Autocomplete

Recently I was searching for some code on GitHub, and I learned from their blog that the engine behind it is ElasticSearch. (One interesting thing is the detailed, technical explanation of their recent outage.)

Elastic Search – From the official site

We want our search solution to be fast, we want a painless setup and a completely free search schema, we want to be able to index data simply using JSON over HTTP, we want our search server to be always available, we want to be able to start with one machine and scale to hundreds, we want real-time search, we want simple multi-tenancy, and we want a solution that is built for the cloud.

Elastic Search is based on Lucene (so far I have not found any popular open source search engine that uses a different indexer). Its position is therefore fairly close to Apache Solr. There are several discussions of “Elastic Search vs Solr”, and I recommend this deep review.

I just tried setting up Elastic Search with my CouchDB application, and it really took me only 20 minutes. Following the official guide is very useful.

CouchDB and Elastic Search are definitely a good fit. CouchDB uses JSON to store documents, while Elastic Search can return the document fields itself, so basically the whole CouchDB document can be put into the index and returned as the retrieved document.

That also means the JavaScript code that handles such JSON objects can possibly be reused.

(That said, I believe search should not replace CouchDB’s native rendering in basic use cases, since the latter is faster, sorted, etc.)

This is also an event-driven model. With the River concept and the _changes stream that CouchDB provides, changes are submitted to Elastic Search for indexing. It is convention over configuration: with no explicit setup of filtering or anything else, it just works. Percolation is also a very interesting feature: one can register queries against an index, then send percolate requests with a document and get back which queries match.
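As a rough sketch of what the River setup amounts to, the snippet below builds the HTTP request that registers a CouchDB river. The shape of the `_meta` document follows the CouchDB river plugin's documentation as I remember it, so treat it as an assumption and check it against your installed version. The database name `lightpollution`, the hosts and the ports are placeholders, not values from my setup.

```javascript
// Sketch: build the request that registers a CouchDB river in Elastic Search.
// Database name, hosts and ports are placeholder assumptions.
function buildCouchRiverRequest(dbName, opts) {
  opts = opts || {};
  return {
    method: "PUT",
    // A river is registered by writing a _meta document under the _river index.
    url: (opts.esUrl || "http://localhost:9200") +
      "/_river/" + encodeURIComponent(dbName) + "/_meta",
    body: {
      type: "couchdb",
      couchdb: {
        host: opts.couchHost || "localhost",
        port: opts.couchPort || 5984,
        db: dbName,
        filter: null // no filter: every change from the _changes feed is indexed
      },
      index: {
        index: dbName, // target index, assumed here to share the database name
        type: dbName
      }
    }
  };
}

const riverRequest = buildCouchRiverRequest("lightpollution");
console.log(riverRequest.method, riverRequest.url);
console.log(JSON.stringify(riverRequest.body, null, 2));
```

The printed URL and body can be sent with curl or any HTTP client. Nothing else needs configuring on the CouchDB side, which is the "it just works" part.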

On the other hand, I spent more time getting the search UI working. The place I looked for insights was the ElasticSearch site itself, whose search box is based on jQuery UI Autocomplete. It is quite a nice combination, though one needs to get familiar with the API and the CSS. This plugin is also needed to render the generated HTML correctly.
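A minimal sketch of the glue between the two, assuming a document field called `name` (hypothetical) and a plain `query_string` search: one function builds the search body from the typed term, the other turns the Elastic Search hits into the `{label, value}` items that jQuery UI Autocomplete expects. Special characters in the term are not escaped here.

```javascript
// Build a search body from what the user has typed so far.
// The trailing "*" makes it a prefix-style match; escaping is left out.
function buildSearchBody(term, size) {
  return {
    size: size || 10,
    query: { query_string: { query: term + "*" } }
  };
}

// Convert an Elastic Search response into jQuery UI Autocomplete items.
// labelField is the (assumed) document field shown in the dropdown.
function hitsToItems(esResponse, labelField) {
  const hits = (esResponse.hits && esResponse.hits.hits) || [];
  return hits.map(function (hit) {
    const doc = hit._source || {};
    const text = String(doc[labelField]);
    return { label: text, value: text, id: hit._id };
  });
}

// Stand-in response with the usual hits.hits[]._source layout.
const sampleResponse = {
  hits: { total: 2, hits: [
    { _id: "a1", _source: { name: "Victoria Harbour" } },
    { _id: "b2", _source: { name: "Victoria Peak" } }
  ] }
};
const items = hitsToItems(sampleResponse, "name");
console.log(buildSearchBody("vic"), items);

// In the page, the two functions would be wired up roughly like this:
// $("#search").autocomplete({
//   source: function (request, response) {
//     $.ajax({
//       type: "POST", url: "/lightpollution/_search", dataType: "json",
//       data: JSON.stringify(buildSearchBody(request.term)),
//       success: function (data) { response(hitsToItems(data, "name")); }
//     });
//   }
// });
```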

Luckily I got to know this at the right time. Originally I did not intend to add a search feature to this application yet; this is more of a POC. I am seriously considering using Elastic Search instead of Solr for my cross-platform search application. The most interesting part is its replication model, as I want to distribute the index; the second is the JSON REST API, as I would like to use JavaScript for a cross-platform UI. I still need to dive deeper into the indexing part.

My CouchDB application – Kanso, CommonJS, Bootstrap, LESS, Google Map API

My application is a simple one that allows users to mark spots of light pollution on a Google Map.

->More [later]

It turns out to be a very good fit for my experiment with CouchDB. The idea is also to do everything in JavaScript. I have recently become a Douglas Crockford fan: I listened to his talk, am reading his JavaScript bible, and happened to read his interview in Coders at Work.

CouchDB

CouchDB is an awesome idea that is mature enough to work. As its official page puts it:

Apache CouchDB™ is a database that uses JSON for documents,
JavaScript for MapReduce queries, and regular HTTP for an API

Although the founders have moved on to the commercial version, Couchbase, CouchDB is under Apache and I believe it will keep gaining momentum.

CouchDB is a DBMS; however, with the attachment model, one can put HTML, program logic and images in it, and it can function as a complete application! [More on its architecture later]

The downside is that the programming model is not very intuitive and the tools are still limited. Fortunately, it keeps things simple enough to be manageable to learn. It turns out the codebase can be really small for an application with basic CRUD features, and it is scalable by default (MapReduce!). Especially compared to my other app with a Play/Hibernate stack, this is really easy to start with.

About the model: it forces you to think, during design, about what exactly you want to show in the presentation layer and how it relates to the data structure. You want to show records from the database, but what exactly is the range, and how do you sort it? You are displaying pages using templates, so how are they rendered, and how can caching be done? At the same time you are taking care of how the data is indexed (a view is an index in CouchDB). It sounds like coupling data and view, but it is not. There are no complex queries and it is not SQL; it is a simple list/view, and what you stick to is the business requirement.
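To make "a view is an index" concrete, here is a small sketch. The design document name `spots`, the view name `by_created` and the fields `type`, `created` and `name` are made up for illustration. The emitted key decides the sort order, and the range is expressed with the `startkey` / `endkey` query parameters, whose values CouchDB expects to be JSON-encoded.

```javascript
// Hypothetical design document: one view that indexes spots by creation date.
const designDoc = {
  _id: "_design/spots",
  views: {
    by_created: {
      // CouchDB stores the map function as a string and runs it per document.
      map: "function (doc) { if (doc.type === 'spot') { emit(doc.created, doc.name); } }"
    }
  }
};

// Build the URL for querying a view; every parameter value is JSON-encoded.
function viewUrl(db, ddoc, view, params) {
  const query = Object.keys(params).map(function (key) {
    return encodeURIComponent(key) + "=" +
      encodeURIComponent(JSON.stringify(params[key]));
  }).join("&");
  return "/" + db + "/_design/" + ddoc + "/_view/" + view +
    (query ? "?" + query : "");
}

// Newest 20 spots of 2012: with descending=true the range runs from the
// later key (startkey) back to the earlier one (endkey).
const spotsUrl = viewUrl("lightpollution", "spots", "by_created", {
  startkey: "2012-12-31",
  endkey: "2012-01-01",
  descending: true,
  limit: 20
});
console.log(designDoc._id, spotsUrl);
```

The range and the ordering are decided when the view is designed, which is exactly the "think about presentation during design" point.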

For tooling, I have been using CouchApp, which was initially the standard little framework that allows one to quickly leverage CouchDB to build an application. The biggest problem is that it is not documented very well. It is now also deprecated and replaced by Erica.

This is also what is recommended by the “official” and most complete guide, CouchDB: The Definitive Guide. It gives some design insight into CouchDB; however, to be frank, even the published version is still a work in progress and quite hard to read.

Kanso

I am now using Kanso

Comparing them, there are some differences in the layout and file structure. I would say the features in Kanso are more complete, such as ID generation and upload. It is based on node.js, where the most beneficial part is the module management: npm, and using require as in CommonJS. I have used RequireJS (AMD) before and I found this even easier, although they may not be exactly the same concept.

One very useful feature is autodeploy (autopush) whenever you update your code. This is done with Watchr (guide here), which is a smart choice: easy to install, lightweight and generic enough for any app. The script can be customized, which is also useful in case compiling from .less to .css is required.

Following the sample project, I use the Duality framework. I believe my application should not be bound to it, though I need to dive deeper into its ideas. Its current limitations include a dependency on an old jQuery version. It also seems to deviate a bit from the original model; for example, I found it hard to render an image with a show function without generating it through an HTML template.

For deployment, I am trying Cloudant on CloudBees. After all, all I need to do is a “push”; it really feels like git.

Image storage

Another requirement for the application is image storage and processing. I need to allow users to upload images, for which “saving images in CouchDB” should be the simplest model.

To scale, some also suggest using Amazon S3 separately. Regarding the storage platform, my main considerations are price first, then speed.

The whole thing is composed of several parts.

  1. How to resize the image before upload (client side!) & restrict the size.

    There are numerous ways to do this, one of which makes use of HTML5. I believe I will use this jQuery plugin, which has all the basic stuff. Other choices: 1, and 2) plupload, which is really cross-platform and can fall back to Silverlight/Flash.

  2. Upload via REST API

    I am not sure whether there is an upload feature for attachments, so I am using basic curl at the moment. I would be glad to write one for Kanso.

  3. Render the image with correct path & cache

    I am still trying to get CouchDB right: should I render with base64 or use the attachment path directly? It turns out this is not that easy.

  4. Make sure I can still pay for the volume
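Parts 1 to 3 can be sketched in a few lines, under assumptions: a maximum side of 1024 pixels (an arbitrary limit), and a hypothetical document `spot-1` at revision `1-abc`. The first function only computes the target size that a canvas-based resize would use. The second builds the request for CouchDB's standalone attachment API (`PUT /db/docid/filename?rev=...`); the same path without the `rev` parameter is what an `img` tag would point at when using the attachment path directly.

```javascript
// Part 1: compute the size to draw the image at so that the longer side
// does not exceed maxSide. Images that are already small are left alone.
function fitWithin(width, height, maxSide) {
  const scale = Math.min(1, maxSide / Math.max(width, height));
  return {
    width: Math.round(width * scale),
    height: Math.round(height * scale)
  };
}

// Part 2: describe the HTTP request that stores a file as an attachment.
// The current document revision must be passed along.
function attachmentRequest(db, docId, rev, fileName, contentType) {
  const path = "/" + encodeURIComponent(db) + "/" +
    encodeURIComponent(docId) + "/" + encodeURIComponent(fileName);
  return {
    method: "PUT",
    url: path + "?rev=" + encodeURIComponent(rev),
    imagePath: path, // Part 3: usable directly as an <img> src
    headers: { "Content-Type": contentType }
  };
}

const resized = fitWithin(4000, 3000, 1024);
const upload = attachmentRequest(
  "lightpollution", "spot-1", "1-abc", "photo.jpg", "image/jpeg");
console.log(resized, upload);
```

With curl, the upload request corresponds to sending the file as the request body with the given method, URL and Content-Type header.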

Then localization

Not much to do here; I just want a proper solution for having English & Chinese versions of my site.

I once worked as a summer intern in a startup specifically on the localization of a web page. (I literally did the translation!)

This library is more than enough. https://github.com/fnando/i18n-js

i18n – 18 stands for the number of letters between the first i and last n in internationalization. God knows.

I also found that people are building generic frameworks to take care of plural and gender handling, like messageformat.js. Quite interesting.

I think the important part is to get it right with Handlebars, like here: what should the order of parsing be to speed things up?

For “on-the-fly localization”, we get the string to localize first, then look it up in the table.
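The lookup itself is tiny. Below is a minimal sketch with a hand-written table and made-up keys; it is not the i18n-js API, just the idea of "get the key, look it up, fall back to English".

```javascript
// Hypothetical translation table keyed by locale, then by message key.
const translations = {
  en: { "spot.add": "Add a spot", "spot.list": "All spots" },
  zh: { "spot.add": "新增地點" }
};

// Look the key up in the requested locale, then in the fallback locale,
// and finally return the key itself so missing strings stay visible.
function translate(locale, key, fallbackLocale) {
  const table = translations[locale] || {};
  if (Object.prototype.hasOwnProperty.call(table, key)) {
    return table[key];
  }
  const fallback = translations[fallbackLocale || "en"] || {};
  return Object.prototype.hasOwnProperty.call(fallback, key)
    ? fallback[key]
    : key;
}

console.log(translate("zh", "spot.add"));   // found in the zh table
console.log(translate("zh", "spot.list"));  // falls back to English
console.log(translate("zh", "spot.none"));  // unknown key is returned as-is
```

A template helper would simply call `translate` with the current locale while rendering.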
More basics for web [TODO, moving to another post]

Layout – Bootstrap, LESS

The only thing I will say about Bootstrap is that it turns layout design back into a programmatic task, which is good for me. Find something useful that looks nice, then plug it in. This should give a better layout than what I did before.

I have known about these things for a while, but putting a real application in place is another story.

I haven’t done serious web programming for a while (or ever). There are many things to explore and best practices to learn from.

  • Naming conventions – div classes, JavaScript variables, static files

  • Don’t put Bootstrap classes directly in the markup – separate the view layer – how LESS can help
    It seems my LESS compiler cannot compile because it cannot import external CSS. In this case I will need to break Bootstrap down and use its .less files directly.

The frameworks I mentioned make my life much easier, especially for a newbie and layout idiot like me. Meanwhile, I do think many things could be done much more simply, like rendering a grid with cells of the same height.

  • The forever battle – grid-like layout: not with tables, but how?

    It has long been considered bad practice to use tables for layout. However, the point is that it is also not easy to do with divs and CSS. There is even a timer like this: http://giveupandusetables.com/
    It is easy when using a fixed height, but not so when the height is dynamic.
    You either use jQuery or a CSS trick to accomplish it.

  • Best practices for caching & web app performance 

    Some good reads: Google best practices, Even Faster Web Sites

  • What to do for responsive CSS for mobile
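For the equal-height grid problem in the list above, the script-based trick boils down to "measure every cell, then set them all to the tallest". A minimal sketch, assuming the cells are plain DOM elements (the `.grid-cell` selector in the comment is hypothetical); the demo uses stand-in objects so it runs outside a browser too.

```javascript
// Set every cell to the height of the tallest one and return that height.
// Works on anything exposing offsetHeight and style, such as DOM elements.
function equalizeHeights(cells) {
  const tallest = cells.reduce(function (max, cell) {
    return Math.max(max, cell.offsetHeight);
  }, 0);
  cells.forEach(function (cell) {
    cell.style.height = tallest + "px";
  });
  return tallest;
}

// Stand-in cells with dynamic heights, in place of real DOM elements.
const fakeCells = [120, 80, 200].map(function (h) {
  return { offsetHeight: h, style: {} };
});
const tallestCell = equalizeHeights(fakeCells);
console.log(tallestCell, fakeCells.map(function (c) { return c.style.height; }));

// In a page this would be called after the content has rendered, e.g.:
// equalizeHeights(Array.prototype.slice.call(
//   document.querySelectorAll(".grid-cell")));
```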

Cross Platform Javascript framework

Recently during my research I came across Zotero for citations. (I will talk about that topic later.)

To my surprise, this app on my Mac is built with JavaScript. I have been thinking of using JavaScript (instead of, or besides, Java) for a cross-platform app, at least for the UI part. This is the first time I have seen such a thing in the wild and, most importantly, it is not bad.

Then I studied “cross-platform JavaScript frameworks” a bit.

Zotero is based on XUL, which is Mozilla’s XML-based language for building UIs. I know it works, but it may not be that portable to mobile, and all that XML may not be my cup of tea.

Many Frameworks

Another post for even more frameworks (Chinese)

The first one also has insights about the distribution:

From Eclipse Open Source Developer Report 2012

60% of open source developers for Android / iOS use only the official SDK.

Among those who use cross-platform frameworks:

– jQuery Mobile (28.6 percent)

– PhoneGap (17.9)

– Sencha Touch (7.9)

– Dojo Mobile (4.9)

– Titanium (2.8).

Most are on a JavaScript/HTML5/CSS stack.

Some frameworks support only mobile devices, while I also want Mac / Windows apps (I am lazy).

Thus jQuery Mobile is out.

Update: jQuery Mobile and other frameworks like PhoneGap are not exclusive but complementary. To put it simply, jQuery Mobile focuses on layout.

There are also comments that frameworks like jQuery, Ext and MooTools are slow on devices, as their cores focus on cross-browser compatibility, which is unnecessary on mobile devices. (This comment comes from another framework, XUI, and is also its problem statement.)

Titanium seems to support Mac apps. It uses backbone.js. More info on how it works. Its marketing is awesome, a bit too good to be true. I am concerned about whether it is open enough, easy to learn and popular. Judging by the percentage above, the last at least does not seem to be true. And XML again.

I drilled down into some more that are Mac-enabled.

Another mobile framework, PhoneGap, is also popular, and there is a Mac port for it. Its engine is now Apache Cordova after the donation.
Nice post about its mechanism: compared to Titanium, which converts the app into native code for execution, PhoneGap and another framework, AppMobi, provide a native app container to run JS/HTML/CSS.

MacGap is a similar one, but Mac only.

 

Then from this SO thread I learned about Chromium Embedded (CEF).

http://code.google.com/p/chromiumembedded/

As its name implies, it basically puts a Chrome inside an app. It intercepts any HTTP request and invokes the corresponding local call when needed.
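The interception idea can be illustrated without any CEF code. The sketch below is purely conceptual and uses no CEF API: the host name `app.local`, the `/native/...` paths and the handlers are all invented. It shows a table of local calls keyed by request path; anything not in the table would be passed on as a normal HTTP request.

```javascript
// Hypothetical table of local calls, keyed by the request path.
const localHandlers = {
  "/native/echo": function (params) {
    return { msg: params.msg };
  },
  "/native/platform": function () {
    return { name: "desktop" }; // stand-in for a real native lookup
  }
};

// Decide whether a request is answered locally or goes out to the network.
function handleRequest(requestUrl) {
  const parsed = new URL(requestUrl, "http://app.local");
  const handler = localHandlers[parsed.pathname];
  if (!handler) {
    return { handled: false }; // not ours: let the embedded browser fetch it
  }
  const params = {};
  parsed.searchParams.forEach(function (value, key) {
    params[key] = value;
  });
  return { handled: true, body: handler(params) };
}

console.log(handleRequest("/native/echo?msg=hi"));
console.log(handleRequest("/index.html"));
```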

It supports C/C++ and has ports for many languages, which means it can do a lot of low-level stuff.

The concern I have heard so far is that it is hard to debug. It sounds like future stuff though; I will just give it a try later.

AppJS looks promising & more mature. It seems to be based on CEF itself and uses Node.js. However, it is for Mac/Linux/Windows but not mobile.

It gives you the Chrome console for debugging. One to watch.

http://stackoverflow.com/questions/13106120/appjs-compile-into-single-executable

Meanwhile, I think things like interacting with the filesystem with reasonable performance still have a long way to go. The clustering part or deduplication is more likely to be done in Java.