02-10-2010, 03:46 PM   #1
Veteran Member
falconeye's Avatar

Join Date: Jan 2008
Location: Munich, Alps, Germany
Photos: Gallery
Posts: 6,871
Robots.txt to exclude translated versions

I suggest that the site's robots.txt file be altered in a way that search engines don't include translated versions of the site in their results.

I used to search the site using Google searches like:

"<searchitems> site:pentaxforums.com"

This is no longer possible because zillions of translations come before any meaningful result.

Additionally, it isn't possible to restrict a search to the English version because, unlike the translated versions, it has no common URL prefix. The German version, for example, can be searched with site:pentaxforums.com/forums/de. This leads to the paradoxical situation where I can search in German but not in English.


Actually, I am no fan of the forum software's translate feature. Anybody can use Google Translate to browse the site, but we shouldn't be forced to read auto-translated responses. That is a minor point, though. Not being able to search anymore is a major one, and the site's own search function never turned up useful results (for me).
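
As a rough sketch of what such a robots.txt change could look like, the Python snippet below generates Disallow rules for translated URL prefixes and checks them with the standard-library urllib.robotparser. The prefix list is an assumption (only /forums/de is named above, /ja/ comes up later in the thread) and is certainly incomplete, and the two thread URLs are made up for illustration. Note also that Disallow only stops crawling; pages that are already indexed may linger in results for a while.

```python
from urllib.robotparser import RobotFileParser

# Assumption: translated pages live under /forums/<lang>/.
# The real, complete list of prefixes would have to come from the site admin.
LANGUAGE_PREFIXES = ["de", "ja"]


def build_robots_txt(prefixes):
    """Return robots.txt text that blocks crawling of translated URL spaces."""
    lines = ["User-agent: *"]
    for lang in prefixes:
        lines.append(f"Disallow: /forums/{lang}/")
    return "\n".join(lines) + "\n"


robots_txt = build_robots_txt(LANGUAGE_PREFIXES)

# Verify the generated rules with the standard-library parser.
parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# Hypothetical URLs, for illustration only.
native_url = "https://www.pentaxforums.com/forums/some-thread.html"
german_url = "https://www.pentaxforums.com/forums/de/some-thread.html"

print(robots_txt)
print("native crawlable:", parser.can_fetch("*", native_url))
print("German crawlable:", parser.can_fetch("*", german_url))
```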

02-10-2010, 03:58 PM   #2
Administrator
Site Webmaster
Adam's Avatar

Join Date: Sep 2006
Location: Arizona
Photos: Gallery | Albums
Posts: 49,781
The whole point of the translation system is to generate more international traffic.

Since you probably use google.de, you might want to add the hl=en parameter to the search URL, so that translated results are omitted. That should accomplish what you're looking for.
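
A minimal sketch of building such a search URL in Python, assuming the usual https://www.google.com/search endpoint with the q and hl query parameters. The helper name google_search_url is made up for this example. hl sets the interface language; whether it actually filters out translated pages is the open question here.

```python
from urllib.parse import urlencode


def google_search_url(terms, site="pentaxforums.com", hl="en"):
    """Build a Google search URL restricted to one site, with hl as interface language."""
    query = f"{terms} site:{site}"
    return "https://www.google.com/search?" + urlencode({"q": query, "hl": hl})


print(google_search_url("falconeye"))
```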

Adam
PentaxForums.com Webmaster (Site Usage Guide | Site Help | My Photography)




02-11-2010, 02:17 AM   #3
Veteran Member
falconeye's Avatar

Join Date: Jan 2008
Location: Munich, Alps, Germany
Photos: Gallery
Posts: 6,871
Original Poster
Originally posted by Adam:
The whole point of the translation system is to generate more international traffic.

Since you probably use google.de, you might want to add the hl=en parameter to the search URL, so that translated results are omitted. That should accomplish what you're looking for :)
Adam, of course I did try everything to make Google usable again on pentaxforums.com.

If I search for "falconeye site:pentaxforums.com", then a Dutch hit is #4, and the results that follow include only a few English entries. That is with google.com and hl=en ...

If you really want to generate traffic this way, then an easy fix would be to put all native content under a common root such as forums/com.

I then could search for native content like this:

falconeye site:pentaxforums.com/forums/com

as I already can search for German content like this:

falconeye site:pentaxforums.com/forums/de

Ironically enough, this yields no hits:

falconeye site:pentaxforums.com/forums/en


This forums/com root could be added alongside the existing URLs so that existing links don't break. However, the site would have to link into forums/com somewhere so that search engines crawl this part of the namespace.
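
To illustrate the idea of an additional root, here is a small Python sketch that maps a path under the proposed forums/com alias back to the existing native path. How the forum software actually routes URLs is unknown to me, so the function name, the constants, and the example paths are all hypothetical.

```python
ALIAS_PREFIX = "/forums/com/"   # proposed additional root for native content
NATIVE_PREFIX = "/forums/"      # existing root, left untouched


def resolve_alias(path):
    """Map a path under the proposed alias root to the existing native path.

    Paths outside the alias root (including translated ones) pass through unchanged.
    """
    if path.startswith(ALIAS_PREFIX):
        return NATIVE_PREFIX + path[len(ALIAS_PREFIX):]
    return path


# Hypothetical paths, for illustration only.
print(resolve_alias("/forums/com/some-thread.html"))
print(resolve_alias("/forums/de/some-thread.html"))
```

Existing links keep working because nothing under the old root changes; the alias only gives search engines a common prefix to match with site:.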



I am a site donor. I have only limited understanding if the value of pentaxforums.com is reduced in order to generate more traffic, i.e. revenue.

For me, Google adwords would have less impact than this translation feature destroying searchability.

Thank You for your understanding.

Last edited by falconeye; 02-11-2010 at 02:23 AM.
02-11-2010, 09:59 AM   #4
Administrator
Site Webmaster
Adam's Avatar

Join Date: Sep 2006
Location: Arizona
Photos: Gallery | Albums
Posts: 49,781
Adding /en/ to all URLs would kill URL consensus, so that's out of the question. Unfortunately, it seems the only way for you to search the site normally would be to set your browser language to en instead of de. Alternatively, have you tried using the Google search in our dropdown menu? That's a CSE and it may be more liberal.

02-11-2010, 10:12 AM   #5
Administrator
Site Webmaster
Adam's Avatar

Join Date: Sep 2006
Location: Arizona
Photos: Gallery | Albums
Posts: 49,781
I tried adding the lang:en parameter to the cse searches and that seems to always return native results.

Same should be true of regular google searches; worst case you can also use the advanced search to exclude /de/

I've also been working on a tweak that will automatically display native content rather than translated content if you are logged in and have translations disabled.

Also fyi even if banners are ever added, they won't be visible to site supporters as my philosophy is based on a clutter-free site.
02-11-2010, 10:41 AM   #6
Veteran Member
falconeye's Avatar

Join Date: Jan 2008
Location: Munich, Alps, Germany
Photos: Gallery
Posts: 6,871
Original Poster
Originally posted by Adam:
Adding /en/ to all URLs would kill URL consensus, so that's out of the question. Unfortunately, it seems the only way for you to search the site normally would be to set your browser language to en instead of de. Alternatively, have you tried using the Google search in our dropdown menu? That's a CSE and it may be more liberal.
My proposal to add /en/ was as an additional URL space, not the primary one.

Just like /de/, /ja/ etc. are additional URL spaces which do exist already.

The browser language has no influence upon the search results. Using hl=en or google.com has a positive effect but there are still many disturbing hits.

There simply are too many English words which remain in translated pages.

Originally posted by Adam:
I tried adding the lang:en parameter to the cse searches and that seems to always return native results.

Same should be true of regular google searches; worst case you can also use the advanced search to exclude /de/

I've also been working on a tweak that will automatically display native content rather than translated content if you are logged in and have translations disabled.

Also fyi even if banners are ever added, they won't be visible to site supporters as my philosophy is based on a clutter-free site.
Thanks for the hint about the site's google search. Yes, it behaves like using hl=en at google.com.

Still, searching for falconeye returns about half foreign-language hits in the top ten.

I cannot exclude /de/ as I would have to exclude ALL language prefixes (I am getting them all which is what makes this so annoying -- it isn't my browser setting or location). There isn't even any /de/ hit in the top ten in the example I am giving...

Maybe you can tweak the site's own Google search to exclude all language prefixes. You know the list of existing prefixes and can do it once for all of us members.
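
The kind of filtering I mean can be sketched in Python as follows: given a list of result URLs, keep only those whose path does not start with a known language prefix. The prefix set is an assumption (only /de/ and /ja/ are named in this thread; the admin has the full list), and the example URLs are made up.

```python
from urllib.parse import urlparse

# Assumption: incomplete list of language prefixes used under /forums/.
KNOWN_LANGUAGE_PREFIXES = {"de", "ja"}


def is_translated(url):
    """Return True if the URL path looks like /forums/<known language prefix>/..."""
    parts = urlparse(url).path.strip("/").split("/")
    return (
        len(parts) >= 2
        and parts[0] == "forums"
        and parts[1] in KNOWN_LANGUAGE_PREFIXES
    )


def native_only(urls):
    """Drop translated pages from a list of result URLs, preserving order."""
    return [url for url in urls if not is_translated(url)]


# Hypothetical result list, for illustration only.
results = [
    "https://www.pentaxforums.com/forums/some-thread.html",
    "https://www.pentaxforums.com/forums/de/some-thread.html",
    "https://www.pentaxforums.com/forums/ja/other-thread.html",
]
print(native_only(results))
```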

I appreciate your effort to keep the site fun to use and clutter free. My posting is my attempt to help you in this effort.
02-11-2010, 10:48 AM   #7
Administrator
Site Webmaster
Adam's Avatar

Join Date: Sep 2006
Location: Arizona
Photos: Gallery | Albums
Posts: 49,781
It's weird that you're getting all those results. Can you give me a sample search query with ugly results so that I can investigate? Obviously, your username alone would likely behave like that.

Also, yes, I do believe that I could get the cse to ignore other languages. I'll have a look at that when I have some free time for technical work.
02-11-2010, 11:09 AM   #8
Veteran Member
falconeye's Avatar

Join Date: Jan 2008
Location: Munich, Alps, Germany
Photos: Gallery
Posts: 6,871
Original Poster
Originally posted by Adam:
It's weird that you're getting all those results. Can you give me a sample search query with ugly results so that I can investigate? Obviously, your username alone would likely behave like that.

Also, yes, I do believe that I could get the cse to ignore other languages. I'll have a look at that when I have some free time for technical work.
Whenever you need to give an example, you'll be short of one.

The last query I used (falconeye wafer yield) behaved OK: 3 hits with the embedded Google search, and 11 hits from google.com with 1 foreign hit. Maybe the language links are now deeper in the site and ranked lower by Google; I don't know.

Thanks for having a look at this.

All times are GMT -7.