Search engine spiders and your web site
April 24, 2009 by top 10 optimizer
Filed under Search Engine Optimization
Are you sure that search engines understand your web site? Search engines see your web pages with different eyes than web surfers.
A web page that looks great to the human eye can be totally meaningless to search engines. For example, search engines cannot read the text on the images of your web site, and many don’t understand web languages such as JavaScript or CSS.
If you have a great looking web site that is meaningless to search engines, you won’t be able to achieve high search engine rankings with that web site – no matter how good and interesting your web site content is.
In general, search engines cannot see content that is presented in the following file formats:
- images (GIF, JPEG, PNG, etc.)
- Flash movies, flash banners, etc.
- JavaScript and other script languages
- other multimedia file formats
Some search engines can index some of these file formats but in general, it’s very difficult to obtain high search engine rankings if your main web site content is presented only in these formats.
Search engines need text to index your web site. They cannot know what’s written on your GIF or JPEG images or in your Flash movies. If you use a lot of images on your web site, you should also create some web pages that contain a lot of text.
If you want to find out how search engines see your web site, you can use a search engine spider simulator tool. Such a tool emulates the software programs that search engines use to index your web site and shows you which elements of your web site are visible to search engines.
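To give a rough idea of what such a simulator does, here is a minimal Python sketch that keeps only the text and links a simple text-only spider could read. It is an illustration under simplifying assumptions, not how Top10Optimizer's tool or any real search engine works; the names SpiderView and spider_view are made up for this example.

```python
from html.parser import HTMLParser


class SpiderView(HTMLParser):
    """Collects the text and links a simple text-only crawler would see."""

    SKIPPED = {"script", "style"}  # assumption: the spider ignores these blocks

    def __init__(self):
        super().__init__()
        self.text_parts = []
        self.links = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are outside script and style blocks.
        if self._skip_depth == 0 and data.strip():
            self.text_parts.append(data.strip())


def spider_view(html):
    """Return (visible_text, links) for an HTML string."""
    parser = SpiderView()
    parser.feed(html)
    parser.close()
    return " ".join(parser.text_parts), parser.links


page = (
    '<html><body><h1>Hello</h1><script>var x = 1;</script>'
    '<img src="logo.gif"><a href="/about.html">About us</a></body></html>'
)
text, links = spider_view(page)
print(text)   # the image and the script contribute nothing
print(links)
```

Note that the image and the script add nothing to the text the spider collects, which is the point made above.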
Top10Optimizer’s search engine spider simulator helps you to find out how search engines see your web site. It even allows you to emulate specific search engine spider names so that you can find out whether your competitors send different pages to search engine spiders.
Just enter the URL of your web site and Top10Optimizer’s spider simulator will tell you what text and which links search engines can find on your site.
That allows you to quickly find out whether your web site lacks information that search engines need to properly index your web site.
See a sample of Top10Optimizer’s search engine spider simulator reports.
15 things that you can do when your website is not listed
April 22, 2009 by top 10 optimizer
Filed under Search Engine Optimization
First of all, you should make sure you can give the web page for which you want to get high rankings a 100% rating.
If you have a 100% rating, make sure that the following factors don’t destroy your optimization work:
1. Don’t use frames
If at all possible, avoid frames. Many search engines have difficulty with frames and it is very difficult to get high search engine rankings for websites that use frames.
Even Google has difficulty with frames. Here’s Google’s official statement about frames:
“Google supports frames to the extent that it can. Frames can cause problems for search engines because they don’t correspond to the conceptual model of the web. In this model, one page displays only one URL. Pages that use frames display several URLs (one for each frame) within a single page.”
2. Avoid Flash and other multimedia elements
Most search engines cannot index Flash pages. The normal text content on your web pages matters most to search engines. If you must use Flash on your website, make sure that you also offer normal text for the search engines. Text in Flash elements is invisible to search engines.
3. Don’t use welcome pages
Some websites use a “Welcome to our website” image with a link to the actual site as the index page for the website. Don’t do this. Some search engines might not follow the link on the welcome page and your index page won’t contain any useful content for search engines.
In addition, most web surfers don’t like these welcome pages. Your index page should not look like www.zombo.com
4. Choose a reliable hosting service
Your web page should be hosted by a reliable hosting service. Otherwise, it could happen that your web server is down when a search engine spider tries to index it. If your website fails to respond when the search engine’s index software program visits your site, your site will not be indexed.
Even worse, if your website is already indexed and the search engine spider finds that your site is down, you could be removed from the search engine database. It’s essential to host your website on servers that are very seldom down.
5. Choose a fast hosting service
Search engine crawler programs that index Web pages don’t have much time. There are approximately 4-6 billion Web pages all over the world and search engines want to index all of them. So if the host server of your Web site has a slow connection to the Internet, your Web site may not be indexed by the major search engines at all.
You may also want to limit the size of your homepage to less than 60K. This will also benefit the still numerous users who connect to the Internet with a slow modem. Even for the casual Internet user, the performance of a Web site can make the difference between pleasure and frustration.
6. Take a look at the HTML code of your web pages
Select “View source” in your web browser to take a look at the source code of your website. Some web pages contain so much JavaScript code and other HTML commands that the actual content is hard to find.
If you cannot immediately see the content of your web page when you view the source code, then it’s likely that there is so much additional code in your web pages that search engines stop indexing the page before they come to the actual content. Use external JavaScript code and external CSS code to make your pages as short as possible.
Your HTML code could also contain errors that prevent search engines from parsing your web pages. Use the HTML validator in IBP to check the HTML code of your web page.
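If you would rather measure than eyeball how much of a page is actual content, a small script can compare the amount of visible text with the total size of the HTML. The following Python sketch is one rough way to do that; the function name text_ratio is an assumption made for this example, and no search engine publishes a ratio that it requires.

```python
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Counts characters of text that sit outside script and style blocks."""

    def __init__(self):
        super().__init__()
        self.chars = 0
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.chars += len(data.strip())


def text_ratio(html):
    """Return visible-text characters divided by total characters (0.0 to 1.0)."""
    if not html:
        return 0.0
    collector = _TextCollector()
    collector.feed(html)
    collector.close()
    return collector.chars / len(html)


lean = "<p>Hello world</p>"
bloated = "<script>" + "var x=1;" * 100 + "</script><p>Hello world</p>"
print(round(text_ratio(lean), 2), round(text_ratio(bloated), 2))
```

A page whose ratio drops sharply after adding inline scripts is a good candidate for moving that code into external files.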
7. Don’t even think of tricking the search engines
Don’t use text in the color of your web page background and don’t stuff obscure HTML tags with your keywords. Search engines don’t like to be tricked. If you try to trick search engines, it’s likely that your website won’t be listed.
Google and other major search engines have extra departments that deal with web spam. They will find the spam elements on your website sooner or later.
It is better to design your web pages so that they are beneficial for all: web surfers (who find what they’re looking for), search engines (which get better results) and you (who gets the customers).
8. Don’t use redirections
If the web page you submit contains a redirection to another website, most search engines will skip your website completely. Do not submit a redirection web page. Many webmasters tried to cheat search engines with redirection pages in the past.
The search engine companies discovered this and decided to skip web pages with redirections entirely. Submit the actual web page that contains the content of your site.
9. Avoid dynamically created web pages
Databases and dynamically generated web pages are great tools to manage the contents of big websites. Imagine you’d have to manage the website contents of the New York Times without databases.
Unfortunately, dynamically generated web pages can be difficult for search engine spiders because the pages don’t actually exist until they are requested. A search engine spider is not going to be able to select all necessary variables on the submit page.
Most search engines can index dynamically generated pages up to a point, but even Google states that it has problems with dynamically created pages. Here is Google’s official statement:
“Yes, Google indexes dynamically generated webpages, including .asp pages, .php pages, and pages with question marks in their URLs.
However, these pages can cause problems for our crawler and may be ignored. If you’re concerned that your dynamically generated pages are being ignored, you may want to consider creating static copies of these pages for our crawler.”
10. Make sure that you allow search engine robots to index your site
Imagine you’re an Internet marketing service company and you keep trying very hard to get top rankings in the search engines for your customer. Even after several weeks, the customer’s website hasn’t been listed in any search engine.
Then you see that your customer blocked the search engine spiders by not properly configuring the robots.txt file. Details about the robots.txt file can be found in the related article.
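You can check this kind of mistake yourself with a few lines of Python. The sketch below uses the robots.txt parser from Python's standard library; the helper name is_allowed, the spider name and the example.com URL are only placeholders for this illustration.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# A robots.txt that accidentally locks every spider out of the whole site.
blocking = "User-agent: *\nDisallow: /\n"
# A robots.txt that allows everything (empty Disallow line).
permissive = "User-agent: *\nDisallow:\n"

url = "http://www.example.com/index.html"
print(is_allowed(blocking, "Googlebot", url))
print(is_allowed(permissive, "Googlebot", url))
```

The first file blocks the page and the second one allows it, even though the two files differ by a single slash.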
11. Make sure that search engine spiders can access your website
Search engine spiders don’t have the functionality of full-fledged Web browsers such as Microsoft Internet Explorer, Firefox or Opera.
In fact, search engine robot programs look at your Web pages like a text browser does. They like text, text, and more text. They ignore information contained in graphic images but they can read <IMG ALT> text descriptions.
This means that search engine spider programs are not able to use Web browser technology to access your site. If your Web pages require Flash, DHTML, cookies, JavaScript, Java or passwords to access the page, then search engine spiders might not be able to index your Web site.
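Since spiders can read the ALT text of an image but not the image itself, it is worth checking that none of your images lack it. Here is a small Python sketch that lists the images without a usable ALT text; the class name AltChecker and the helper images_missing_alt are made up for this example.

```python
from html.parser import HTMLParser


class AltChecker(HTMLParser):
    """Records the src of every <img> that has no non-empty alt attribute."""

    def __init__(self):
        super().__init__()
        self.missing_alt = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            attr_map = dict(attrs)
            # A missing, valueless or blank alt attribute gives spiders nothing to read.
            if not (attr_map.get("alt") or "").strip():
                self.missing_alt.append(attr_map.get("src") or "")


def images_missing_alt(html):
    checker = AltChecker()
    checker.feed(html)
    checker.close()
    return checker.missing_alt


page = (
    '<img src="a.gif" alt="Company logo">'
    '<img src="b.gif">'
    '<img src="c.gif" alt="">'
)
print(images_missing_alt(page))
```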
12. Make sure that your web server returns the correct HTTP status code
Some web servers are not properly configured and they return an error code when someone requests a web page. Although the page is displayed fine in your web browser, search engine spiders might receive an error code.
Check your web pages to make sure that your website returns a 200 OK code to search engine spiders.
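One way to run this check is to request the page with a spider-like user agent and look at the status code that comes back. The Python sketch below does that with the standard library; the function name status_for_spider and the user agent string ExampleSpider/1.0 are assumptions for this example, not the name of any real spider.

```python
import urllib.error
import urllib.request


def status_for_spider(url, user_agent="ExampleSpider/1.0", timeout=10):
    """Return the HTTP status code the server sends for the given user agent."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urlopen raises for 4xx and 5xx answers; the code is still what we want.
        return error.code


if __name__ == "__main__":
    # Replace with a page of your own site.
    print(status_for_spider("http://www.example.com/"))
```

Keep in mind that urlopen follows redirects, so this reports the status of the final page rather than of the redirect itself. Running it with different user agent strings also shows whether a server answers spiders differently from browsers.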
13. Make sure that search engines can resolve your DNS name
A mistake that novice users often make is to register a domain name (for example, www.my-great-site.com), and immediately submit the website URL to the search engines. Then they wonder why the search engines didn’t index their site. It could be that the search engines tried, but the domain name was not available yet.
It takes approximately 2-4 days until a domain name becomes active. All Internet access providers must update their records (DNS tables) to reflect new site locations.
The process of updating DNS tables is called propagation. Search engines must also update their DNS tables and until then, the new domain name www.my-great-site.com doesn’t work. So when you register a new domain name, you must wait about 48-72 hours before submitting the domain name to the search engines.
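Before you submit a new domain, you can at least test whether the name resolves from your own connection. This Python sketch shows the idea; the function name resolve is made up for this example, and a successful lookup on your machine does not prove that every search engine's DNS servers already know the name.

```python
import socket


def resolve(hostname):
    """Return the IPv4 address for hostname, or None if it does not resolve."""
    try:
        return socket.gethostbyname(hostname)
    except socket.gaierror:
        return None


if __name__ == "__main__":
    # Replace with your newly registered domain name.
    address = resolve("www.my-great-site.com")
    if address is None:
        print("The name does not resolve yet, wait before submitting.")
    else:
        print("The name resolves to", address)
```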
14. Avoid special characters in your URL
Most search engines have problems indexing web pages when their URLs contain special characters. The following special characters are known to be “search-engine-spider-stoppers”:
- ampersand (&)
- dollar sign ($)
- equals sign (=)
- percent sign (%)
- question mark (?)
These characters are often found in dynamically generated Web pages. They signal the search engine crawler program that there could be an infinite loop of possibilities for that page. That’s why some search engines ignore web page URLs with the above characters.
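A quick way to spot such URLs on your own site is to scan them for the characters listed above. A minimal Python sketch, with the names SPIDER_STOPPERS and stopper_characters invented for this example:

```python
# The five characters named in the list above.
SPIDER_STOPPERS = "&$=%?"


def stopper_characters(url):
    """Return the listed special characters that occur in url."""
    return [ch for ch in SPIDER_STOPPERS if ch in url]


print(stopper_characters("http://www.example.com/products/widgets.html"))
print(stopper_characters("http://www.example.com/item.php?id=7&cat=2"))
```

The first URL comes back with an empty list, while the second one is flagged for the ampersand, the equals sign and the question mark.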
15. Make sure that your website has enough content
If your website consists of only one or two optimized pages it will be difficult to get good search engine rankings. Search engines try to find web pages that offer valuable content to web surfers.
Your website should have at least six pages and each page should have at least 200 words. Search engines need text to index web pages.
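The rule of thumb above is easy to check with a script once you have the visible text of each page. The Python sketch below counts whitespace-separated words; the function names and the way pages are passed in (a dictionary of URL to text) are assumptions for this example.

```python
MIN_PAGES = 6     # minimum number of pages suggested above
MIN_WORDS = 200   # minimum number of words per page suggested above


def word_count(text):
    """Count whitespace-separated words in the visible text of a page."""
    return len(text.split())


def thin_pages(pages):
    """Return the URLs of pages with fewer than MIN_WORDS words."""
    return [url for url, text in pages.items() if word_count(text) < MIN_WORDS]


def has_enough_content(pages):
    """True if the site has at least MIN_PAGES pages and none of them is thin."""
    return len(pages) >= MIN_PAGES and not thin_pages(pages)


# Six dummy pages with 200 words each, then one page shortened on purpose.
site = {"/page%d.html" % i: "word " * 200 for i in range(6)}
print(has_enough_content(site))
site["/page0.html"] = "too short"
print(thin_pages(site), has_enough_content(site))
```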




