Facts about Google and Competition (google.com)
80 points by johnpark on Oct 20, 2013 | 67 comments


It's funny that Google is publishing this. I work in the ad tech industry, and Google did this to us 3 months ago when we tried to connect to their ad exchange:

- Signed contracts about 200 pages long, including really long NDAs

- Took a $35k deposit

- Put us through a dev/testing cycle, including a few hundred million hits each day for about a month

After spending 3 months on it, they said that we were profiling user data (which we were not; it was an ignorant mistake on our part in checking an item YES in the contract rather than NO). We pleaded, begged and argued, to no avail. Google blocked us from accessing their ad exchange.

In the meanwhile...Google does user profiling openly, mining search queries, Google+, Google Maps and everything else available at their disposal, including the kind of apps that you open on your phones (non android included, thanks to Admob).

It seems to me that they are deliberately blocking any fair competition in the marketplace. If this is not a monopoly, then what is this supposed to be?


"For every search query performed on Google, whether it’s [hotels in Tulsa] or [New York Yankees scores], there are thousands, if not millions of web pages with helpful information."

True.

But there is exactly one Trade Winds Central Inn in Tulsa.

If I look for [hotels in Tulsa] right now, the name of that inn is prominently displayed in the "Hotels in Tulsa on Google" box. Clicking the name will lead not to a page about the inn but to a page of search results where that specific inn is again prominently displayed at the top. Clicking that link will bring a booking page where Expedia, Priceline and multiple others bid on providing booking service for that specific inn.

And if I scroll all the way to the bottom of that page I find a little link: http://www.tradewindstulsa.com/ That's the one vital link for that one little inn in central Tulsa, Oklahoma.

The "algorithm" placed the name of the inn on the top of the front page. Then it placed the actual link two clicks deep and at the bottom. The "algorithm" is not stupid...


If you're looking for "hotels in Tulsa", you almost certainly want to book a hotel. Hotel booking sites seem like good search results.

If you're looking for "Trade Winds Central Inn in Tulsa", the first result, after one ad, is www.tradewindstulsa.com/.

Yes, Google displays a lot of ads sometimes. But those sometimes are almost always when you are looking to purchase something, and Google works hard to make sure those ads are things you're interested in purchasing.


Maybe I'm just a cynical, skeptical bastard, but if you have to actively state that you have serious competition, then I can quite reasonably assume that you are, in fact, a total monopoly.

Welcome to the Ministry of Truth, my friends. Let the doublethink engulf your senses, and may the newspeak slip off your tongue, for it is clear that Big Brother is watching, and it appears that he is deathly afraid of the EU. May the DOJ have mercy on his soul.


I work at http://samuru.com, where about 70% of searches are unique. Without the hordes of default-install users who type in the product slogan of every commercial, or try to get the Jeopardy questions before the clock runs out, we don't have a great cache hit ratio.

Combine this with the fact that Google is doing more and more to get cache collisions in their results (returning results that don't contain all the words in your search because it deemed a word unimportant, or using synonyms) and it is hard to compete on speed.

That's why we don't. We compete on the idea we have better results.
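
To make the cache point concrete, here is a minimal sketch. It is not Samuru's or Google's actual caching; `lru_hit_ratio` and its `capacity` parameter are hypothetical names. It replays a stream of queries through a small LRU cache and reports the hit ratio. A stream dominated by repeated queries scores high, while a long-tail stream of mostly unique queries scores near zero no matter how big the cache is.

```python
from collections import OrderedDict


def lru_hit_ratio(queries, capacity):
    """Replay queries through an LRU cache and return the fraction that hit."""
    cache = OrderedDict()  # keys are queries; insertion order tracks recency
    hits = 0
    total = 0
    for query in queries:
        total += 1
        if query in cache:
            hits += 1
            cache.move_to_end(query)  # mark as most recently used
        else:
            cache[query] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used query
    return hits / total if total else 0.0


# Repetitive "head" traffic versus a long tail of unique queries.
head_heavy = ["weather", "news", "weather", "news", "weather", "news"]
long_tail = ["q1", "q2", "q3", "q4", "q5", "q6"]
print(lru_hit_ratio(head_heavy, capacity=2))
print(lru_hit_ratio(long_tail, capacity=2))
```

With 70% unique queries, at most 30% of requests can ever be served from cache, whatever the eviction policy.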


I just tried "temperature princeton tuesday" and didn't get a useful result on either your site or Bing.

IMHO it's queries like that that make Google so much better than the rest.


Our latent search isn't fully rolled out. In some of our other products this works very well.

With Samuru we had to balance speed against all of the latent search we could do. We do a few tricks to keep the latent search from impacting the main search, but weather is slow at the volume we work in, because we can't cache every geo and the National Weather Service isn't fast.


Interestingly, I searched "android securerandom" on both, and Google has the first result as the javadoc and the second as a blog post explaining a serious vulnerability in it and how to mitigate it.

The same search on samuru has no results about the vulnerability.


Do the search again. We have two indexes, a shallow one for things we have never seen and a deep one for things people have searched for. The results have likely changed since your first search. This would not be an issue if we had only 15% new searches.


Good to see a google alternative. Also, I'm very curious: is this really written in PHP?


No, what gives you that impression? It is python, running on Google AppEngine


Ah, it's because I think I've seen the .php extension somewhere.


Was using Samuru from July to mid-September this year. While I applaud the way it just searches for what I ask for, no more no less, there was just one major issue that made me turn to DDG: it loads too slowly. The way I use it is as the default search engine in Opera. So to search, I do Ctrl-T, start typing the query, hit Enter. After the Enter it sometimes took more than 5 seconds to show the results. I don't know why that is, but it's just too long in comparison with others, sorry. It seems better now, but it's still slower than the competition.


We do more. We just don't have enough of a user base to get good caching. You are probably the first person to do most of your searches, and that slows us down, whereas DDG and Google have likely seen a query like yours before.

I like to think that the summaries vs. the snippets make up for some of that, we get you more info in one go.

We also lose a bit because we return about 50 results at a go instead of 10.


Little cosmetic bug, doesn't seem to affect the results: http://i.imgur.com/ttof6iD.png


Will look into it.


I just tested Samuru vs. Google with the query:

speed of the iss

I think Google clearly won in this case.


Apparently you need "What is the speed of the ISS" to get an answer, though if you're looking for related links, you get http://issondemand.com/ as the first result, which is less... stellar than the result google provides


> I just tested Samuru vs. Google with the query:

why is grass green?

Samuru clearly won in this case


They both appear to have relevant results.


It's funny how a monopoly will pretend they’re not a monopoly while the non-monopolies try to pretend they are.


Sun Tzu - Appear weak when you are strong, and strong when you are weak


Completely off topic, but why can't I keep the page zoomed out far enough to view the entire width of a paragraph? I'm on my iPhone; is Google doing something screwy to fuck with iOS users?


Doesn't align properly on Android/Firefox either


I have the same problem. I also thought Google is punishing me for using an iphone, just like they did with maps ;)


Usually these issues are due more to browser settings...


Safari on iOS doesn't have many settings... The page is just broken.


I mean Safari might be rendering it wrong by default. The fact there are few settings is a limitation which may or may not impact the rendering...

Either way I doubt it's any sort of ploy to fuck with iOS users...


I wonder how many of them are literally "new", as in, "What cures a hang over" versus "What cures a hangover", and how many of them are "new" by the time Google's computer breaks it into a normalized query? I guess to the engineers designing that process, it's all new data that their algorithm has to deal with. But it'd be interesting to see the vitality of the search for new concepts and knowledge among Google's users.
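
As a rough illustration of how much the answer depends on normalization, here is a toy sketch. It assumes nothing about Google's real pipeline; `normalize_query`, `spacing_insensitive_key` and `count_new` are hypothetical helpers. It counts how many queries in a stream are "new" under three different notions of sameness.

```python
import re
import unicodedata


def normalize_query(query):
    """Toy normalizer: Unicode-normalize, lowercase, drop punctuation, collapse spaces."""
    text = unicodedata.normalize("NFKC", query).lower()
    text = re.sub(r"[^\w\s]", " ", text)  # replace punctuation with spaces
    return " ".join(text.split())


def spacing_insensitive_key(query):
    """More aggressive key that also ignores spaces, so 'hang over' == 'hangover'."""
    return normalize_query(query).replace(" ", "")


def count_new(queries, key=normalize_query):
    """Count queries whose key has not appeared earlier in the stream."""
    seen = set()
    new = 0
    for query in queries:
        k = key(query)
        if k not in seen:
            seen.add(k)
            new += 1
    return new


queries = [
    "What cures a hang over",
    "What cures a hangover",
    "what cures a hangover?",
]
print(count_new(queries, key=str))                      # literal strings
print(count_new(queries))                               # case and punctuation folded
print(count_new(queries, key=spacing_insensitive_key))  # spacing folded too
```

The same three queries count as three, two, or one "new" search depending on the key, so a figure like 15% only means something once you know which notion of sameness is behind it.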


One measure of "newness" is if the top 10 results are the same as other queries. After all, if two queries produce the same list of results, then aren't they effectively the same query? I doubt that's what's being used here, though.
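
A minimal sketch of that idea, purely as an illustration (the function names and the example URLs are made up): treat two queries as the same if their top-k result lists match exactly, or use the Jaccard overlap of the two lists as a softer score.

```python
def same_top_results(results_a, results_b, k=10):
    """True if both queries return identical top-k results in the same order."""
    return list(results_a[:k]) == list(results_b[:k])


def top_k_overlap(results_a, results_b, k=10):
    """Jaccard overlap of the two top-k result sets, ignoring order."""
    a, b = set(results_a[:k]), set(results_b[:k])
    if not a and not b:
        return 1.0  # two empty result lists are trivially identical
    return len(a & b) / len(a | b)


# Hypothetical result lists for two spellings of the same question.
hang_over = ["example.org/cures", "example.org/remedies", "example.org/myths"]
hangover = ["example.org/cures", "example.org/remedies", "example.org/science"]
print(same_top_results(hang_over, hangover))
print(top_k_overlap(hang_over, hangover))
```

A threshold on the overlap score would be needed to turn this into a yes/no "new query" decision, and picking that threshold is a judgment call.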


this may be a stupid question, but what does the link have to do with "Google and Competition"?


This is a classic "made for regulators" set of pages. The title lets regulators know that it's for them, but the content has to explain Google's business as simply and positively as possible. You'll also notice that when you click through each of the pages, almost every criticism that could be lobbed at Google by a 3rd party is countered in some way.

Also note that the page came about mid-2011, around the same time when Microsoft was beating their "Google is a monopoly" drum.

http://web.archive.org/web/20110826063413/http://www.google....


The previous title was:

    Google: "15% of the searches we see everyday we've never seen before."  
The linked page does come from Google's "Facts about Google and Competition", but the section "About Search" appears to be interesting for broader reasons.


This is a response to http://www.fairsearch.org/ which was created by a coalition of Google's competitors to try to convince governments to regulate Google.


Google Search is broken.

Even Verbatim search starts to look fishy.

Search for 'engine oil' and you will find 'health benefits from using fish oil'. I'm tired.


Interesting, I searched for "engine oil" and I don't get anything about fish oil in 5 first pages of results. First two hits are the most relevant as well - wikipedia entries.


Just a poor hyperbole, yet I don't believe you haven't seen these issues before. Just looking now to fix a Wordpress site that a cheap developer left us without any documentation. Search for "wordpress schema explorer", results:

- Outlook Web Access 2003 doesn't work with Internet Explorer 10

- Some SOQL and Salesforce results (can you even turn off personalisation?)

- Crystal Reports 2008 Database Explorer Doesn't Show All Schemas

Verbatim results: visual studio 2008 - How do you open XML Schema Explorer - Stack ...

There is a single mention of Wordpress in that page: at the page footer.

Also, how about inability to filter by date AND verbatim results?


Anecdotal example of broken queries != broken search.


Especially when the query is easily verifiable as "not broken". It makes me suspect this person's ability to keep accurate statistics.


You get instaupvotes on HN for hating on Google so there is that.


A few years ago it was reported as being 20-25%. http://searchengineland.com/google-25-of-queries-are-new-add...


How would they know this for sure if they are only supposed to be storing searches for 18 months?


A Bloom filter, for example, would permit them to know this without retaining any searches.
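
For anyone unfamiliar, here is a small sketch of how that could work. It is an assumption-laden toy, not a description of what Google does; `BloomFilter` and `fraction_never_seen` are hypothetical names and the sizes are arbitrary. The filter stores only bits, never the query text. It can report a false "seen before" but never a false "never seen", so it would slightly undercount new queries.

```python
import hashlib


class BloomFilter:
    """Fixed-size bit array with k hash positions per item (toy implementation)."""

    def __init__(self, num_bits=1 << 20, num_hashes=7):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive k positions from one SHA-256 digest via double hashing.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force an odd step
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


def fraction_never_seen(queries, seen):
    """Fraction of queries the filter had not seen before; updates the filter."""
    new = 0
    total = 0
    for query in queries:
        total += 1
        if query not in seen:
            new += 1
        seen.add(query)
    return new / total if total else 0.0


seen = BloomFilter()
print(fraction_never_seen(["hotels in tulsa", "yankees scores", "hotels in tulsa"], seen))
```

The trade-off is that the false-positive rate grows as the filter fills up, so at billions of queries per day the bit array would have to be sized accordingly.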


Or even statistical sampling.


When we're talking about billions of queries per day, aggregate data is the only kind of data that is worth anything.


I'm sure they are storing term volume indefinitely, considering google.com/trends.


They are anonymizing queries after 18 months, not deleting them.


I feel Google will get replaced soon. The fact that search is getting more personalised is hurting me more than anything else. We don't get unbiased results. The user experience will definitely die down with this!


Think out of the box: you most certainly WANT personalised result. What you don't want is biased news


imagine a republican seeing more republican friendly results and a democrat seeing democratic results. Now, this would make Google really look unreliable at some point of time !


Yes, I've seen (as many others did) the filter bubble video, but I'll propose a counter argument: imagine looking for "python" and having results referring both to wildlife and programming languages.

The questions that I asked myself after watching the video are:

What is the expected result?

Are there any logical fallacies that I need to consider?

What are the flaws associated to a point of view?

What are the flaws associated to the other point of view?

Why are they flaws?

What can be done to fix them?

If some things cannot be done, is there any workaround?


see duckduckgo and bubble


I've been trying to wean myself off search engines, and rather search directly from within specific websites:

Wikipedia - general information

StackOverflow - programming

IMDB - actors/movies

etc...

Obviously finding those websites in the 1st place requires a search engine or index of some kind, but I'm getting faster results going directly to the source :)



"we can create computing programs, called “algorithms”"

I don't think that word means what you think it means.


Yeah, it's not very clear or clean, but also take into account the audience this site is aimed at.


The whole 'algorithms do the ranking' thing is a misdirection. Algorithms do of course rank pages, but they do so according to criteria chosen by humans employed by Google. Google absolutely chooses which pages rank highly and which do not according to their own subjective human judgement, applied by machine. It is disingenuous of them to pretend otherwise.


Can you give a concrete example of this? What 'criteria' is chosen by humans? Do you mean stuff like are the key terms in the h1 tags? I would still consider that an algorithm doing the ranking, and PageRank also is the major factor in deciding what gets returned, and though a human wrote the mathematics, they aren't scoring each page individually to quantify the value of each link.
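
To illustrate the point about PageRank, here is a textbook power-iteration sketch on a tiny made-up link graph. It is the published basic algorithm, not Google's production ranking; the `damping` value of 0.85 is the commonly quoted default and the page names are invented. A human chooses the formula and the damping factor, but no one scores individual pages.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Basic PageRank by power iteration. `links` maps page -> list of pages it links to."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page gets the "random jump" share first.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's rank evenly among its outbound links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical three-page web: a and b both link to c, c links back to a.
ranks = pagerank({"a": ["c"], "b": ["c"], "c": ["a"]})
for page, score in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(page, round(score, 3))
```

Here c ends up ranked highest, then a, then b, purely because of who links to whom.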


Greatest Living American. Google it. Read how I was the greatest living american, Colbert was, and then an seo firm was. All in the matter of a few weeks, because google changed scores to "hand job" the results.


It feels like something is wrong with your comment. I'm pretty sure that what Google does looks more like A/B testing than a law-making process. Your claim basically means that they would need some magic power to guess how adjusting one parameter (never mind several) will affect the final result. Could you guess how mathematical attractors will behave when you change some parameters? Are you sure Google hires super-geniuses who can predict the weather? If they do, why hasn't Google released a weather forecasting service yet that would wipe out all the competition?


They don't need magic powers. They employ hundreds of human raters to judge the quality of search results according to Google's criteria and biases.


While I won't speak for gibwell, I'm interpreting what they are saying as: Google does set the criteria for which metrics make a page rank higher or lower.

For example:

- More time users spend on a page indicates it's more relevant

- More high quality inbound links indicates higher quality content

- Content seen as spammy can lower page rank

The fact is that Google don't discriminate based on the rules they set out within their algorithm. When Google say their algorithm does the choosing, it does, but it's the same algorithm used across the whole internet, giving a fair competitive landscape. That's the theory at least.


What worries me more is that Google is increasingly choosing to put brands at the top of the search results.

Don't get me wrong, if I'm looking for the Apple website, I should absolutely get Apple.com first in Google, and not a medium-sized blog talking about Apple. So relevancy should matter most.

However, I fear that they are ranking bigger sites and bigger brands above more relevant posts from medium-sized sites, too. And that I don't like. Give the little guy a chance, especially if his post is of higher quality.

And no, I don't think that if the big site's post gets retweeted 100 times and the medium-sized site's post gets retweeted 10 times, it should matter much, because resharing is just a side-effect of being big and having a big audience, and I think it's less about the "quality" of the post.

Google ends up promoting bigger and bigger sites at the top, while downranking the smaller ones, who are impacted negatively by a lot of factors (smaller age, fewer backlinks, fewer reshares, etc).

So I guess my point is, on-page "performance" (for lack of a better word) should always count more than off-page performance.


This is just a power law applied to the distribution of inbound links. As time progresses, the gap widens.

Moreover, very frequently a little guy has content on bigger sites and enjoys traffic from being integrated into the bigger site ecosystem. Reasoning about search engine result pages is hard because it is a very complex system with multiple various parameters that are factored in.


I agree that Google uses one rule set across the whole internet. I don't think they have anything resembling a list of competitors to downgrade or anything like that.

However the algorithm is based on hundreds of metrics, and is highly discriminating. In addition, Google uses human raters to determine what a good quality result is. This is fed back into tuning the weighting of the metrics.

Therefore, the biases and judgements of these human employees will be reflected in the results.


Oh no, you are wrong. The machine learning algorithms used by Google learn by observing the behaviour of users, and the result of this learning is essentially a black box of parameters and weights for the target function. Certainly they analyze the data and add some parameters manually, for example through a manual review process, but most of the work is done in the opaque black box. So your statement that "Google absolutely chooses which pages rank highly and which do not according to their own subjective human judgement" is not accurate.


You say I'm wrong but your argument then supports exactly my position. The 'black box' is trained by humans, and it applies their judgments to the pages it processes.



