Blocking unsupported requests on nginx

Recently I have been observing a lot of traffic from bots, crawlers, and malicious users on a project I am working on. It is a Rails application.

A lot of crawlers try to access pages like /phpmyadmin/manager.php, all of them ending in .php. Similarly, a lot of requests end in .cgi, .xml, and .jsp.

My application does not support such request types. nginx passes these requests on to the Rails router, and since the routes do not exist, the application raises a routing error. I do not want to catch and monitor these exceptions. In fact, I do not want these requests to even reach the application server. I think unsupported request types should be handled at the level of the web server, i.e. nginx.

And there is a way to do it. Add this entry to the nginx.conf file, under the server block:

location ~ \.(aspx|asp|php|jsp|cgi|xml)$ {
    return 410; 
}

It matches the unsupported request types and returns a 410 status code.
410 (Gone) basically means that the page you are looking for has been removed permanently and has no forwarding address.

The difference between 404 and 410 is that 404 means the page could not be found, without saying whether it ever existed or where it went, while 410 means the page existed but has been removed permanently. The main use case of 410 is to inform search engines that it is time to reindex because the route no longer exists. And, as shown above, it also works well for blocking unsupported request types.
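A quick way to check that the block works, after reloading nginx, is to request one of these paths with curl and look at the status line (example.com here is just a placeholder for your own host):

curl -i http://example.com/phpmyadmin/manager.php
# expected status line:
# HTTP/1.1 410 Gone

The request never reaches the Rails app, so no routing error is logged.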

If you want to read about more options available in nginx, you can read about them here.

Postgres in Yosemite

I have just updated to Yosemite and my worst fears have come true. The first problem I have faced so far is running Postgres. As soon as I try to start the Rails server, I get this error:

"could not connect to server: Connection refused (PG::Error)"

So, it is clear that my Postgres server is not running yet. So I try to start Postgres with this command:

postgres -D /usr/local/var/postgres/

The error I get is this:

"FATAL:  could not open directory "pg_tblspc": No such file or directory"

After doing some research on the internet, I found that these are indeed directories that are supposed to exist under /usr/local/var/postgres/.

So I look for this file or directory under /usr/local/var/postgres/, but no such file or directory exists. So I create the directory at that path:

mkdir /usr/local/var/postgres/pg_tblspc

And now when I try to start the server, I get a new error:

FATAL:  could not open directory "pg_twophase": No such file or directory

So again, after looking for this file/directory and finding that it does not exist, I created the directory:

mkdir /usr/local/var/postgres/pg_twophase

And when I try to start the server once more, I get yet another error:

LOG:  could not open temporary statistics file "pg_stat_tmp/global.tmp": No such file or directory

So I create this directory also:

mkdir /usr/local/var/postgres/pg_stat_tmp

And finally the postgres server starts. Yayyy…!!!
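For anyone hitting the same sequence, the three missing directories can also be recreated in one go (this assumes the default Homebrew data directory; adjust the path if yours differs):

mkdir -p /usr/local/var/postgres/{pg_tblspc,pg_twophase,pg_stat_tmp}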

Rumbling at RailsRumble, 2014.

RailsRumble. (http://railsrumble.com)

It is a 48-hour hackathon where you have to build a web application using a Ruby-based web framework. It was my second time participating in this event. This year's RailsRumble was very difficult for me, because this time I was not part of any team: I was not in Pune at JoshSoftware's office but in Chicago, so I decided I might as well try working alone.

A couple of weeks before the event I had decided that I would work on 'RubyInSense'. The main purpose of the application would be to give Ruby developers a platform to solve tricky Ruby problems in ways that could surprise even the experts.

Here is the link to application: http://rubyinsense.r14.railsrumble.com
Here is the entry in railsrumble: http://railsrumble.com/entries/38-rubyinsense

This is a photo of the RubyInSense mascot:

[photo: RubyInSense mascot]

Things I have learned at RailsRumble this year:

1. Working alone is very, very difficult. It is always better to work in a team. It was 4 o'clock in the morning when the thought of quitting came to my mind, although I did not quit. I saw light at the end of the tunnel 🙂

2. Learned a couple of card tricks. Even though it is a 48-hour event, you just cannot work for 48 hours. You still need time to sleep, eat, and relax. I usually watch a television show or a movie to relax, but those are very time-consuming and I could not afford them. So I learned a couple of card tricks instead, which is much more fun and gives my eyes time to relax since I do not have to look at another screen.

3. Do not try to do it all. As I was working alone, I had to take some calls on reducing the scope of the application. I had to drop some ideas, for example adding comments to questions and answers, and a user profile page. Even before the event started, I had chalked out a plan for how I would finish within the 48 hours. But what I did not account for was the time I would spend completing the documentation, setting up Heroku and GitHub, and chasing the inevitable "it-worked-on-my-local" bugs, which meant going through the logs. All of that adds up.

4. Eating melons helps you concentrate. Whenever I was stuck during the event, I tried eating melons and it worked wonders for me. At one point I just could not figure out why the modal pop-up had stopped working all of a sudden; I had spent a good half an hour on the problem. My sister suggested I eat some melons, and as soon as I looked back at the assets, I realised that I had two different copies of bootstrap.js. Some people drink coffee to stay awake and help them concentrate; I eat melons. Here is the proof that I am not lying: I ate a whole bowl of melons just to get my brain working.

What RubyInSense also offers:

1. Browse – Browse questions added by other users.
2. Add Questions – It allows you to add questions and challenge your fellow Rubyists to solve them.
3. Add Answers – It allows you to browse all the questions and add answers to them.
4. Learn – It also gives you the chance to look at the solutions added by other users.

What's the roadmap for the future:

1. Allow the user to add comments to questions and answers.
2. Open the registration process for users. It is closed right now to make it easier for everyone to judge the entry.
3. Add User profile page.
4. Add test cases.
and a lot more…. I plan to open-source the whole project, so I will add all these points to the issues page. If anyone wants to look up the project on GitHub, I will create a new repo in a few days under 'rishijain'; the name will most likely be 'rubyinsense_v2'. I will update the link here anyway.

Here is the link to see all the entries: http://railsrumble.com/entries/all
Here is the ‘RubyInSense’ in railsrumble: http://railsrumble.com/entries/38-rubyinsense

I already hope to participate in RailsRumble again next year.

Excel Report using axlsx gem

Axlsx is an excellent gem for creating and formatting Excel reports for your Rails (or plain Ruby) application.

This is not an RTFM post. If you are looking for the basic setup, you can refer to the official examples page or read this post written by a friend of mine.

This post is about the problems I faced and the discoveries I made while using the gem.

I ran into issues even before I had written any code to build an Excel report, because of the versions of axlsx and its runtime dependencies. Axlsx requires rubyzip to run.

First I had to deal with this error: uninitialized constant Zip::DOSTime.
It was caused by the wrong version of rubyzip: a deprecated version that internally used zip-zip and clashed with axlsx. So I upgraded rubyzip, and this error popped up instead: uninitialized constant Zip::OutputStream. That was again due to the versions of the two gems, and to resolve it I had to change the versions of both once more.

Finally I settled on these versions:
1.) axlsx 1.3.6
2.) rubyzip 1.1.6

Now, just when I thought I was out of the version dependency issues, precompiling the assets gave me this error: cannot load such file -- zip/zip. This is because rubyzip >= 1.0.0 requires zip like this:

require 'zip'

And rubyzip <= 0.9.9 requires it as:

require 'zip/zip'

So if you have some other gem in your application that also depends on rubyzip (<= 0.9.9), just pin your rubyzip version to 0.9.9. For example, I was also using the 'roo' gem, which needed rubyzip <= 0.9.9. So, to resolve the error that came up while precompiling, I finally settled on:
1.) axlsx, 1.3.6
2.) rubyzip, 0.9.9
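In the Gemfile that would look roughly like this (a sketch; the exact constraints depend on the rest of your bundle):

gem 'axlsx', '1.3.6'
gem 'rubyzip', '0.9.9'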

One of the other challenges I faced was setting the height of a row. It is missing from the official examples page at the time of writing this post.

Here is how you can do it:

wb.add_worksheet(:name => "fixed row height") do |sheet|
  sheet.add_row ["This row will have a fixed height", "It will overwrite the default row height"], :height => 30
  sheet.add_row ["This row can have a different height too."], :height => 10
end

It will overwrite the default row height, which is 18.
A style can also be added to a row along with the height, like this:

wb.add_worksheet(:name => "fixed row height") do |sheet|
  # custom_style is a style id created earlier, e.g. with wb.styles.add_style(...)
  sheet.add_row ["This row can have a different height too"], :height => 10, :style => custom_style
end

You can see the full example here: https://github.com/rishijain/axlsx/blob/patch-1/examples/example.rb
I have also submitted a pull-request and if you want to follow up with it, you can do it here: https://github.com/randym/axlsx/pull/354
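For completeness, here is a minimal standalone sketch of where the wb in the snippets above comes from (assuming the gem versions mentioned earlier; the output file name is just an example):

require 'axlsx'

# wb in the snippets above is the workbook of an Axlsx::Package.
package = Axlsx::Package.new
wb = package.workbook

wb.add_worksheet(:name => "fixed row height") do |sheet|
  sheet.add_row ["This row will have a fixed height"], :height => 30
end

# Writes the spreadsheet to report.xlsx in the current directory.
package.serialize("report.xlsx")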

Block bots from crawling your application.

Search engines like Google and Bing have bots, called Googlebot and Bingbot, which crawl your web application so it can be indexed. But it might be the case that you do not want these bots to crawl your app. How do you stop them from crawling?

On your server there is a file called robots.txt. For a Rails application it lives in the app's public directory (often at a path like /home/username/public/robots.txt) and is served at /robots.txt. To block all crawling, put this in it:

User-Agent: *
Disallow: /

User-Agent: * means the rule applies to all bots.
Disallow: / means those bots are not allowed to crawl anything on the site.

User-Agent: *
Disallow:

An empty Disallow means all bots (*) are allowed to crawl the entire site.

User-Agent: Googlebot
Disallow: /

This blocks only Googlebot from crawling the site.

Now, what if you want to block a particular file or a particular folder from these bots? Here is how you can do that:

User-agent: *
Disallow: /path_to_some_file/filename.pdf
Disallow: /logs/
Disallow: /tmp/

This means bots are not allowed to crawl the given PDF file or anything under the /logs/ and /tmp/ directories.
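Rules for different bots can also be combined in one file, with one group per bot. For example, a sketch (the paths are just placeholders) that blocks Googlebot completely while only keeping other bots out of /tmp/:

User-agent: Googlebot
Disallow: /

User-agent: *
Disallow: /tmp/

A bot follows the group that most specifically matches its name, so Googlebot would obey only the first block here.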