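I came across this article a long time ago, and the original URL is now hard to find, so I have simply appended the original text at the end.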

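Google is like a young mammoth: already very strong, but still growing. Healthy quarterly results and expectations of continued growth in the online advertising market are the biggest factors letting it keep pace on NASDAQ. But what I want to talk about now is a Google-killer scenario. You may know that I am somewhat obsessed with open source (for example, my OpenHuman and SimpleKDE projects), so my proposal is naturally open source based; perhaps I will call it Google@Home.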

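First, let me explain what Google@Home is. Put simply, Google@Home is a clone of Google, but one that is open source and distributed. There are already many open source search engine projects; Apache Lucene is the best known (Nutch and the Hadoop Distributed File System are its sub-projects). So Google@Home could be built on one of these open source search engines. Of course, it still has a long way to go before it matches what Google has achieved. More importantly, Google@Home would be a distributed, decentralized system: our desktop computers would contribute their idle time to power the search engine, allowing it to compete with Google’s powerful data centers. This is not a new concept. SETI@Home and Folding@Home, two well-known scientific projects, are built on the same idea, and Google itself is the biggest supporter of Stanford University’s Folding@Home project, contributing the resources of its Toolbar to it.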
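To make the distributed idea a little more concrete, here is a minimal sketch, not anything from the original article. It assumes each volunteer desktop holds a slice of an inverted index and a coordinator merges the answers. The names VolunteerNode and Coordinator, and the rule of placing documents by id, are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


class VolunteerNode:
    """One desktop that donates idle time and holds a slice of the index."""

    def __init__(self, name):
        self.name = name
        self.index = defaultdict(set)  # term -> ids of documents held here

    def add_document(self, doc_id, text):
        # Index every whitespace-separated term, lowercased.
        for term in text.lower().split():
            self.index[term].add(doc_id)

    def search(self, term):
        # .get avoids creating empty entries for unknown terms.
        return set(self.index.get(term.lower(), set()))


class Coordinator:
    """Routes documents to nodes and merges the per-node answers."""

    def __init__(self, nodes):
        self.nodes = nodes

    def add_document(self, doc_id, text):
        # Hypothetical placement rule: spread documents across nodes by id.
        node = self.nodes[doc_id % len(self.nodes)]
        node.add_document(doc_id, text)

    def search(self, term):
        hits = set()
        for node in self.nodes:  # a real system would query nodes in parallel
            hits |= node.search(term)
        return sorted(hits)


nodes = [VolunteerNode(f"desktop-{i}") for i in range(3)]
engine = Coordinator(nodes)
engine.add_document(1, "open source search engine")
engine.add_document(2, "distributed search on idle desktops")
engine.add_document(3, "folding proteins at home")
print(engine.search("search"))  # prints [1, 2]
```

A real deployment would also have to handle nodes that go offline, replicate index slices, and guard against malicious volunteers; the sketch leaves all of that out.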

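Comparison with Wikiasari
This new search engine is quite different from Jimmy Wales’s Wikiasari project. Wikiasari draws its strength from Wikipedia, and its weak point is that it depends too heavily on human effort. The large-scale, community-driven encyclopedia works well, yet vandalism still exists (even if at a manageable level). So I am skeptical that the same approach can work well for a search engine.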

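Why create an open source search engine?
The concept may be clear by now, but you may still wonder about the motivation behind it. Why would an organization, or a loosely formed group of people, work together on such a project? Why would anyone donate their computer’s idle time? Here are some reasons: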

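1. A search engine should be an open platform, just like an operating system. Alex wrote a post about image search in which, using himself as an example, he tried to show how poor current image search results are. My response was that search quality could probably be improved through open access to information and face recognition algorithms, as Riya is doing. But we have no access to the search engines’ databases, and most search engine APIs cap the number of queries.

2. We need a better search engine. Collaboration can always produce better results. Imagine researchers from around the world, together with Google’s competitors, all contributing to this system: the resulting “brain” would surely be bigger than the one in Mountain View. The same is true of Microsoft today: it has some of the world’s best developers in its R&D centers, yet it still cannot match the developers of the whole world. That is why Linux leads the server market and is even making headway on the desktop. Look at the Dell and Ubuntu deal and the 3D desktops.

3. Privacy. As the founder of OpenHuman, I am hardly the one to raise this. But many people are in fact frightened by the idea of being watched by the big G’s eye. Google’s compromises in the Chinese market have made people think twice before handing their noisy but still useful search data to Google. Matt Cutts recently wrote a post on how Google handles privacy, but I still have unanswered questions. Google could also be forced to hand over its huge store of data when presented with a subpoena.

4. A growing number of competitors. Not everyone is happy to see Google’s success on NASDAQ; the recent Yahoo, eBay and Microsoft deal is a case in point. Many small startups are unhappy too: Google took over their ideas, and they received nothing in return. Google Calendar, for example, shattered the dreams of 30 Boxes and Kiko. Then there are Google Spreadsheets and Toolbar. Microsoft did the same thing in the 80s and 90s, with Sun, HP and IBM as the victims.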

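Who would create an open source Google?
Quite possibly Google itself, or competitors such as Ask or Yahoo, or P2P experts like Niklas Zennstrom and Janus Friis. Anything could happen, but in my view the most likely option is an alliance of direct competitors. It might well be Microsoft, which has always been closed source.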

There is one last point, but it is not very interesting, so I have not translated it (the original text follows below).

Original English text
Google is like a young mammoth, already very strong but still growing. Healthy quarter results and rising expectations in the online advertising space are the biggest factors for Google to keep its pace in NASDAQ. But now let’s think outside the square and try to figure out a Google killer scenario. You may know that I am obsessed with open source (e.g. my projects openhuman and simplekde), so my proposition will be open source based - and I’ll call it Google@Home.

First let me define what my concept of Google@Home is. Briefly, Google@Home is an open source, distributed clone of Google. We already have many open source search engine projects - Apache Lucene (which is composed of Nutch and Hadoop distributed file system sub-projects) being the most credible one. So this Google@Home concept can be based on one of those open source search engines. Of course it will have a long way to go before reaching Google’s utility and reach. But more importantly, Google@Home will be a distributed, decentralized system. What this means is that our desktop computers’ idle time will become a part of this new search engine’s computational power. In effect this allows it to compete with Google’s beefy data centers. This is not a new concept either: SETI@Home and Folding@Home are two well-known scientific projects that use the same grid computing idea at their core. Indeed, Google itself is the biggest supporter of the Stanford University based Folding@Home, dedicating the resources of its toolbar to this project.

Comparison to Wikiasari

The distributed nature of the engine is what makes it different from Wikipedia co-founder Jimmy Wales’ Wikiasari project, which is an open source wiki-inspired search engine. While Wikiasari’s power may come from Wikipedia, its weakest link is too much human dependency. The power of the masses worked well in the open, community driven encyclopedia project, Wikipedia, but vandalism has still been present - albeit at a manageable level. I’m not sure if this can work so well in search engines though.

Why an open source search engine?

Well, the concept is clear, but you may wonder about the motivation behind it - why would anyone, an organization or a loosely formed group of people, unite around such a project; and why would people dedicate their computers’ idle time to this? Here are some reasons:

1. A search engine is a platform and should be open, just like operating systems. Do you remember Alex’s post on the image search space? By using himself as an example, he tried to prove how lame current image search engines are. The first comment on his entry was from me, and I told him this problem could be solved with open information access and some face recognition algorithms - just like Riya is trying to do. Well, unfortunately we don’t have open access to search engine databases; all we have is the directory dmoz - which is clearly insufficient. Currently, most search engine APIs lock themselves off at predefined low limits of daily queries.
2. Need for a better search engine - collaborative work can always yield better results. Imagine a system that researchers from all around the world, and Google competitors, would contribute to. This would create a bigger brains trust than the one in Mountain View. This is again similar to what’s happening with Windows today. Microsoft has one of the world’s biggest tech talent pools in its campuses all around the world, but it’s impossible to compete with the whole world! And that’s why Linux is a clear leader in the server space, and keeps leaping forward in the desktop arena too - see Dell’s latest Ubuntu Linux deal and the 3D Linux desktops.
3. Privacy is a big concern - as the founder of openhuman, this argument surely doesn’t apply to me, but it’s a fact that many people are scared by the idea of being watched by the big G’s eyes. And Google’s compromises in the Chinese market have pushed people to think one more time before giving their noisy, but still useful, search history data to Google. Google’s Matt Cutts recently wrote an interesting post on his company’s approach to privacy - but there are still remaining questions in my mind. Google is vulnerable to having to give up its huge stack of information when presented with subpoenas.
4. Growing number of competitors - not everyone is happy with Google’s rise on NASDAQ. Case in point: the latest Yahoo - Microsoft - eBay partnership deal. Google, instead of creating new markets just like Amazon does with its artificial artificial intelligence projects and S3 - EC2, is competing heavily with Yahoo, eBay, Amazon and Microsoft. Also many startups are unhappy with Google disrupting their business and not rewarding their innovation. The best examples are Google Calendar and the broken dreams of 30 Boxes, Kiko and others. Also Google Spreadsheets and lately the situation with Google Toolbar and StumbleUpon. This is again what happened with Microsoft in the 80’s and 90’s - when it disrupted Sun, IBM, HP and others.

Who would create an open source Google clone?

Perhaps Google itself. Or Google competitors such as Ask or Yahoo. It might also be something that P2P kings Niklas Zennstrom and Janus Friis are up to - besides their Joost project. Everything is possible, but in my opinion the most plausible option would be a joint attack by direct competitors. Indeed perhaps the best fit would be the classic “closed source” company Microsoft!! This could be a mirror response to Google, who up till now has leveraged most of its PR against Microsoft’s ‘evil’ closed source approach (i.e. the subtle ‘don’t be evil’ mantra of Google). Stranger things have happened.

Revenues

Another idea: this Google@Home project can make more use of the power of the masses at its core - Google is still reluctant to use the power of the masses directly in its search. Yahoo, on the other hand, with its new unified Social Search Unit, seems more ambitious in this arena. As a total underdog, Google@Home would be more open to such innovations and could probably profit from these new paradigms.

How could you support this type of search engine with a complementary distributed and open source ad network? Baris Karadogan has more about this in his blog. (I met him at a conference last week and it turned out that surprisingly we hatched and blogged about these similar concepts at the same time!)

Conclusion

Yes, this is my ‘Google killer’ scenario. There are many open questions though - some of them are:

* Is this really feasible (I think yes) - but your technical input is welcome
* Are there any projects already doing this?
* Would it really be a Google killer, or would the user base stay limited to geeks only?

Let us know what you think, and share your own ‘Google killer’ scenarios too!
