migration to joe’s datacenter

I migrated again. This time, based on one blog post, I’m getting a dedicated server instead of cloud virtual servers.

Since I do not have virtual servers, my environment is rather limited.
So I decided to just go with Apache instead of nginx, and every site is a vhost.

It’s been a few weeks since I migrated, and so far so good.

Look at your html page as xml data for the sake of SEO

I’ve been working on a script that goes to a URL and scrapes some parts of the data, which makes it pretty much a crawler or spider.

If all the pages the crawler landed on were valid, my job would have been easy. In reality, however, many pages are not valid, and the script has to fall back on regular expressions.

Invalid markup can be a good or a bad thing for website owners.

However, exposure is essential for marketing a site, and a valid HTML page has a greater chance of being surfaced by search engines such as google.com, because valid HTML gives the search engine crawler what it wants more efficiently.

I believe the engineers who work on those crawlers have overcome many difficulties caused by invalid markup. Still, if the HTML in a page is not valid (treating it as XML), those smart engineers have to come up with logic to work around it, perhaps with regular expressions. That is prone to mistakes and can lead to scraping only part of the content from an invalid page. After all, engineers are human, and humans make mistakes.
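To make the difference concrete, here is a minimal sketch in Python (not the actual script I wrote, and certainly not how any search engine works). The function name extract_title and the sample pages are made up for illustration. It tries to read the page as XML first and only falls back to a regular expression when the markup is not well formed.

```python
import re
import xml.etree.ElementTree as ET


def extract_title(markup):
    """Return (title, method) for a page; method is "xml" or "regex".

    Hypothetical helper for illustration only.
    """
    try:
        # Well-formed markup can be parsed as a tree and queried as data.
        root = ET.fromstring(markup)
    except ET.ParseError:
        # Invalid markup: fall back to a pattern match, which is more fragile.
        match = re.search(
            r"<title[^>]*>(.*?)</title>", markup, re.IGNORECASE | re.DOTALL
        )
        return (match.group(1).strip() if match else None, "regex")

    for element in root.iter():
        # Strip an XML namespace prefix such as "{http://www.w3.org/1999/xhtml}".
        if element.tag.split("}")[-1] == "title":
            return ((element.text or "").strip(), "xml")
    return (None, "xml")


# Sample pages (made up): the second one has an unclosed <br>.
valid_page = "<html><head><title>Hello</title></head><body><p>Hi</p></body></html>"
invalid_page = "<html><head><title>Hello</title></head><body><p>Hi<br></body></html>"
```

With the valid page the title comes from the parsed tree; with the invalid page the parser gives up and the regular expression has to do the work, which is exactly where mistakes creep in.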

For the same reason, well-formed semantic HTML has a higher chance of being surfaced for a given keyword typed by users.

That’s just my idea of how an HTML page should be constructed with SEO and future use in mind.

So my recommendation is this:

1. Treat markup in a page as data. Forget about presentation and such. Just make sure the data is valid.
2. Use CSS to visualize the data (= HTML markup) so it appeals to users.

It’s quite simple after all.

Battlestar Galactica: Blood & Chrome

I watched Battlestar Galactica: Blood & Chrome last night and felt it was too short. One thing I liked about the Battlestar Galactica series was that the whole storyline leaned on philosophical views (and, of course, battle scenes). This short series did not have that (’cause it was too short).

In the end I enjoyed it because I am a huge fan of Battlestar Galactica. (I even liked Caprica a lot.)

Here’s the link on Amazon:
http://www.amazon.com/gp/product/B00BHNP3SI

facebook becomes another press mockery?

Remember when Yahoo got ridiculed in the press over its low stock price, and they said Yahoo needed great products?

It seems Facebook is the next one to get ridiculed by them, and people say Facebook needs a big product to get its stock back above the IPO price.

Funny how things are after IPO.

Yahoo news’ comments are fun to read

I see a resemblance to reddit in Yahoo news’ comments.
They get a great number of comments from many users, and the comments read a lot like reddit’s.
It’s just my personal observation… that’s all.

Continuous Integration setup at new work place

Recently I finished setting up Jenkins to start CI at the new workplace. The minimum requirements for the environment were:

  • Jenkins server
  • CentOS 6.2 (I could’ve used a different OS, but CentOS was the company-wide server OS)
  • VirtualBox for integration environment for component tests
  • PHPUnit
  • PHP DbUnit
  • PHP CodeSniffer (I created company standard based on PEAR standard)
  • PHP Mess Detector
  • ant (initially I started with Phing, but decided to go with ant)
  • phplint (php -l) 🙂

Our developers’ code coverage agreement is 70% or more. It’s working pretty nicely, and it has definitely given our developers more confidence in the code and the system we are building. I am not 100% satisfied with the current setup yet, though, because I really wanted to implement a package deployment system. (For now, I generate a tar file only when the Jenkins job passes, and the file gets pushed to the integration environment for component tests.)
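The "tar file only when the job passes" part is easy to picture in code. The real setup uses ant inside Jenkins; the Python below is only a rough sketch of the same idea, and the function name package_if_passing, the paths, and the test command are all hypothetical.

```python
import subprocess
import sys
import tarfile
from pathlib import Path


def package_if_passing(test_command, source_dir, archive_path):
    """Run the test command; build a .tar.gz of source_dir only if it exits 0.

    Hypothetical helper: returns True when the archive was written.
    """
    result = subprocess.run(test_command)
    if result.returncode != 0:
        # Tests failed, so no package is produced and nothing gets pushed.
        return False

    with tarfile.open(str(archive_path), "w:gz") as archive:
        # Store files under the directory's own name rather than its full path.
        archive.add(str(source_dir), arcname=Path(source_dir).name)
    return True
```

A call might look like package_if_passing(["phpunit"], "build/app", "dist/app.tar.gz"), assuming phpunit is on the PATH and the archive path is outside the directory being packaged. Pushing the file to the integration environment would be a separate step after a True result.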

When I have some time, I will write up more details.

upgrading PEAR in centos 5.8

My OS was CentOS 5.8, and I was trying to install Phing for PHP application packaging. However, the installation would fail because of the old PEAR I had, which came from a yum installation.

It turned out that I had to manually upgrade to the latest PEAR. The instructions I found are at:

http://pear.php.net/manual/en/installation.getting.php

If an old PEAR installed via yum is on the OS, even updating PEAR itself fails. The only way to upgrade is to do it manually per the instructions.

wget http://pear.php.net/go-pear.phar
php go-pear.phar

The base dir I used during the installation was /usr/share/pear instead of /home/${USER}/pear.

My 2nd last day at Yahoo!

Today was my last day at Yahoo!. The press says bad things about Yahoo!, but the work environment and benefits are actually very good compared to other companies.

The reason for my departure (for the 2nd time) is mainly that I want to become an entrepreneur one day and am setting my career direction toward that. I do not think I can be one right now, but I want a company where I can learn about it quickly and expand my network in the South Bay.

hulu. really?

I am a paid subscriber of Hulu Plus so I can watch shows that are exclusive to Hulu Plus customers. Now it requires DISH Network???

I do not get it.