VadoZ blog archive

Archive for October 2011

Hrefer unhide the SECRETS!

October 2, 2011

This article was written for the article competition held by the Botmaster team.

Here I will describe how to look for trusted sites for placing doorways with the help of the programs Hrefer and Xrumer.

The search can be done by two methods:

  • go through a large number of sites, place doorways on them, and watch the results;
  • observe competitors.

Out of laziness, we will choose the second method.

Let’s analyze the search results. We will have Hrefer collect the top 10 for queries in the topic we are interested in. The first thing to do is decide on the topic. Here I will work with pharma as the most interesting niche. Copy a file with the keywords into Hrefer's Words directory. I deliberately did not select the keywords by popularity, because I wanted the widest possible coverage of the niche.

Now we launch Hrefer, go to the menu Options->Parsing options and uncheck the following options:

  • Convert all links to index (we are interested in the full page addresses)
  • Log founded hight-PR freehostings into the FreeBonus.txt (this option slows the process down)
  • Enable filtering duplicated links by hostnames (counting how often each hostname occurs will be useful later for a qualitative analysis).

Then we check Do not use «Additive words» and Disable filtering by Template. Now it is time to set two new options that appeared in version 3.7:

  • we check Save ‘query -> URL’ info to filename_query.txt (this feature lets us match each link to the query that produced it, which allows finer analysis)
  • we put 1 in the field Deep of parsing (pages) (this option sets the number of result pages to parse).

It is good that we will parse only one page, but Hrefer builds its queries so that the search engine returns 100 links per query, and we need only 10. What do we do? We make Hrefer fetch only 10 links per query. This requires changes to engines.ini, and Hrefer has a very handy tool for that. Go to tuning-> engines.ini and change the Query mask and Total Pages settings as shown in the picture.

Now we give Hrefer the keywords as its Words Database, specify the file where all the links will be saved (mine is pharma_top10.txt), and launch parsing. When it finishes, the file will contain links to every top 10 page for the keywords we are interested in.

Obviously, we cannot go through that many links by hand and eye. So let’s think: what distinguishes profile doorways? The most evident sign is the presence of one of the following strings in the URL:

    /user/|/users/|/profile/|/people/|/member/|/members/|profile.php|member.php

So let’s filter the collected database by these signs. For better results we can also clean hacked sites out of the database by blacklisting .edu and other risky zones. To do that, we create a text file black.txt listing the zones we distrust. We launch Xrumer, open the tool Filter of the link database, specify the path to our database, set black.txt as the filter database, and enter the profile signs in the words-filter. In short, we do as shown in the picture )))
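For readers who want to see the logic of this step outside the GUI, here is a minimal Python sketch of the same filtering. It rests on assumptions: the function name `filter_profiles` and the blacklist contents are made up for illustration, the profile signs are the URL templates used in this article, and nothing here reflects how Xrumer's filter works internally.

```python
import re
from urllib.parse import urlparse

# URL fragments typical of profile pages (the templates used in this article).
PROFILE_SIGNS = re.compile(
    r"/user/|/users/|/profile/|/people/|/member/|/members/|profile\.php|member\.php",
    re.IGNORECASE,
)


def filter_profiles(urls, blacklist):
    """Keep URLs that look like profile pages and whose hostname does not
    end with a blacklisted zone. Hypothetical helper, not an Xrumer API."""
    kept = []
    for url in urls:
        url = url.strip()
        host = (urlparse(url).hostname or "").lower()
        if any(host.endswith(zone) for zone in blacklist):
            continue  # distrusted zone, e.g. ".edu"
        if PROFILE_SIGNS.search(url):
            kept.append(url)
    return kept
```

The blacklist would be the lines of black.txt and the URLs the lines of pharma_top10.txt; matching on the end of the hostname is a simplification chosen for this sketch.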

After a few minutes of parsing I got this little list:

All in all, there is something to do in the evening )))

So we have a small list of profile doorways from Google's top 10. Let’s look at a few of them in the browser.

  • It makes sense to fill black.txt with various white-hat shops, blogs, and the like in addition to the hacked sites. What does this give us? We can apply the black.txt filter alone and skim the result to catch profiles that do not match our URL templates (/user/|/users/|/profile/|/people/|/member/|/members/|profile.php|member.php)
  • It makes sense to write a script that counts how many times each hostname occurs in our database. The more often a host is mentioned, the more often it appears in the top, and the more often it appears in the top, the more it deserves a look. Again, this catches profiles that do not match our URL mask. That is why I recommended not letting Hrefer filter duplicate links. The simplest example of such a script in PHP:
    $hosts = file("input.txt");
    $zz = array();
    for ($i = 0; $i < count($hosts); $i++) {
        $cu = parse_url(trim($hosts[$i]));
        if (empty($cu['host'])) continue; // skip blank or malformed lines
        $host = $cu['host'];
        if (isset($zz[$host])) {
            $zz[$host]++;                 // host seen before: increment
        } else {
            $zz[$host] = 1;               // first occurrence
        }
    }
    arsort($zz); // most frequent hosts first
    foreach ($zz as $k => $v) echo "$k=>$v <br>";
  • It makes sense to analyze the queries that bring profiles into the top. Remember that during Hrefer setup we checked Save ‘query -> URL’ info to filename_query.txt? This is where it comes in handy. What does knowing these queries give us? We can estimate how competitive a query this resource is able to hold in the top.
  • Naturally, it is possible to lift keywords from other people's profile doorways. But you will have to work that out yourself, because I believe it is bad to steal from colleagues.
  • You can edit engines.ini to collect the top for a particular region. Speaking of regions: if you ever want to study the state of affairs in a specific language (Hebrew, for example), be prepared for the fact that Hrefer does not get along with UTF-8. However, the queries can easily be converted from UTF-8 to percent-encoded URI format. An example Python script for this conversion is below.
    # Python 3: read UTF-8 queries, write them percent-encoded, one per line
    import urllib.parse

    with open("query.txt", "r", encoding="utf-8") as src, \
         open("query1.txt", "w", encoding="ascii") as dst:
        for line in src:
            line = line.strip()
            if not line:
                continue
            # unquote first so already-encoded lines are not encoded twice
            encoded = urllib.parse.quote(urllib.parse.unquote(line), safe="=&?/")
            dst.write(encoded + "\n")
  • Backlink analysis. At the moment this area is a complete mess (because of the changes at Yahoo), but the subject is very interesting. The point is that many pharma people run networks of their own guestbooks in which only they spam. If we manage to find a few such guestbooks, we will learn about new trusted locations at the spam stage, not when they are already in the top.
  • Analysis of profile interlinking (it is rare at the moment). It helps to find other resources on which doorways are placed and to parse the keywords competitors use.
  • It is possible to filter profiles directly in Hrefer using the sieve-filter. On the one hand, that would be easy. On the other hand, I do not use this method because I like to skim the complete database of the top, or at least glance at it out of the corner of my eye. Sometimes my eye catches very interesting things.
  • It is well worth checking the server responses of the selected resources with Xrumer's “Analyzer” tool. If we get responses other than 200 OK, the doorway has been deleted, probably after an abuse complaint. That means an administrator is watching the resource and will probably delete our doorways too.
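Two of the ideas from the list above, counting hostnames and catching what the URL mask misses, can be combined in one small script. The following Python sketch is only an illustration under assumptions: the function name `hosts_outside_mask` and the `min_count` threshold are invented here, and the mask is the set of URL templates quoted earlier in the article.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# The URL templates from this article, used as the profile mask.
PROFILE_MASK = re.compile(
    r"/user/|/users/|/profile/|/people/|/member/|/members/|profile\.php|member\.php",
    re.IGNORECASE,
)


def hosts_outside_mask(urls, min_count=2):
    """Return (host, count) pairs, most frequent first, for hosts that occur
    at least min_count times but have no URL matching PROFILE_MASK."""
    counts = Counter()
    matched = set()
    for url in urls:
        url = url.strip()
        host = urlparse(url).hostname
        if not host:
            continue  # skip blank lines and lines that are not full URLs
        counts[host] += 1
        if PROFILE_MASK.search(url):
            matched.add(host)
    return [(host, n) for host, n in counts.most_common()
            if n >= min_count and host not in matched]
```

Passing in the lines of pharma_top10.txt gives a short list of hosts that keep showing up in the top yet never match the mask. These are the ones worth opening in the browser by hand.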

What is bad about this scheme?

The bad thing is that with this scheme you are ALWAYS one step behind your competitors. But it is a very good way to get a first taste of online money.


You do not have to use the resources you find for pharma doorways. There are always other uses for trusted profile pages. For example, they can be used for less competitive topics such as clothes, or as link donors.

