Google seeks world of instant translations
MOUNTAIN VIEW, California (Reuters) - In Google Inc.'s vision of the future, people will be able to translate documents instantly into the world's main languages, with machine logic, not expert linguists, leading the way.
Google's approach, called statistical machine translation, differs from past efforts in that it forgoes language experts who program grammatical rules and dictionaries into computers.
Instead, they feed documents humans have already translated into two languages and then rely on computers to discern patterns for future translations.
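To make the idea of discerning patterns from already-translated text concrete, here is a minimal Python sketch. It is not Google's system: the four English-German sentence pairs are invented, and the scoring rule (a Dice coefficient over sentence-level co-occurrence) is a simple stand-in chosen for illustration. The names PARALLEL, dice_scores and best_translation are hypothetical.

```python
from collections import Counter
from itertools import product

# Toy parallel corpus, invented for illustration: (English, German) sentence pairs.
PARALLEL = [
    ("the house", "das haus"),
    ("the book", "das buch"),
    ("a book", "ein buch"),
    ("a house", "ein haus"),
]


def dice_scores(pairs):
    """Score every (source word, target word) pair with the Dice coefficient,
    based on how often the two words appear in the same sentence pair."""
    src_count, tgt_count, joint = Counter(), Counter(), Counter()
    for src, tgt in pairs:
        src_words, tgt_words = set(src.split()), set(tgt.split())
        src_count.update(src_words)
        tgt_count.update(tgt_words)
        joint.update(product(src_words, tgt_words))
    return {
        (s, t): 2 * n / (src_count[s] + tgt_count[t])
        for (s, t), n in joint.items()
    }


def best_translation(word, scores):
    """Return the target word with the highest association score for `word`."""
    candidates = {t: v for (s, t), v in scores.items() if s == word}
    return max(candidates, key=candidates.get)


scores = dice_scores(PARALLEL)
print(best_translation("book", scores))   # buch
print(best_translation("the", scores))    # das
```

No dictionary or grammar rule is written by hand here; the pairing of "book" with "buch" falls out of the counts alone, and more sentence pairs would make such counts more reliable.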
While the quality is not perfect, it is an improvement on previous efforts at machine translation, said Franz Och, 35, a German who heads Google's translation effort at its Mountain View headquarters south of San Francisco.
"Some people that are in machine translations for a long time and then see our Arabic-English output, then they say, that's amazing, that's a breakthrough," said Och.
"And then other people who have never seen what machine translation was ... they read through the sentence and they say, the first mistake here in line five -- it doesn't seem to work because there is a mistake there."
But for some tasks, a mostly correct translation may be good enough.
Speaking over lunch this week in a Google cafeteria famed for offering free, healthy food, Och showed a translation of an Arabic Web news site into easily digestible English.
Two Google workers speaking Russian at a nearby table said, however, that a translation of a news site from English into their native tongue was understandable but a bit awkward.
FEEDING THE MACHINE
Och, who speaks German, English and some Italian, feeds hundreds of millions of words from parallel texts such as Arabic and English into the computer, using United Nations and European Union documents as key sources.
Languages without considerable translated texts, such as some African languages, face greater obstacles.
"The more data we feed into the system, the better it gets," said Och, who moved to the United States from Germany in 2002.
The program applies statistical analysis, an approach he hopes will avoid diplomatic faux pas, such as when Russian leader Vladimir Putin's translator miffed then German Chancellor Gerhard Schroeder by calling him the German "Fuehrer." The word is verboten in that context because of its association with Adolf Hitler.
"I would hope that the language model would say, well, Fuehrer Gerhard Schroeder is ... very rare but Bundeskanzler Gerhard Schroeder is probably 100 times more frequent than Fuehrer and then it would make the right decision," Och said.
The center of Google's effort looks surprisingly modest. Och shares a spartan office with two others on his team, with little clutter other than a shelf of linguistic books above his desk. That's because the muscle work is performed by machines.
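The frequency comparison Och describes for the Schroeder example can be pictured with a toy calculation. The sketch below is an assumption-laden illustration, not Google's language model: the phrase counts are invented (only the rough 100-to-1 ratio echoes Och's remark), and PHRASE_COUNTS, log_prob and pick_candidate are hypothetical names.

```python
import math

# Hypothetical phrase counts from a German text collection. The numbers are
# invented; only the rough 100:1 ratio echoes Och's remark.
PHRASE_COUNTS = {
    "Bundeskanzler Gerhard Schroeder": 1000,
    "Fuehrer Gerhard Schroeder": 10,
}


def log_prob(phrase, counts, smoothing=1.0):
    """Additively smoothed log relative frequency of a phrase.
    One extra slot in the denominator reserves mass for unseen phrases."""
    total = sum(counts.values()) + smoothing * (len(counts) + 1)
    return math.log((counts.get(phrase, 0) + smoothing) / total)


def pick_candidate(candidates, counts):
    """Choose the candidate rendering the counts consider most likely."""
    return max(candidates, key=lambda p: log_prob(p, counts))


choice = pick_candidate(
    ["Fuehrer Gerhard Schroeder", "Bundeskanzler Gerhard Schroeder"],
    PHRASE_COUNTS,
)
print(choice)  # Bundeskanzler Gerhard Schroeder
```

The rare phrase is not forbidden by any rule; it simply loses to the far more frequent one.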
So far, Google is offering its own statistical machine translations of Arabic, Chinese and Russian to and from English at http://www.google.com/language_tools. Third-party software gives access on the site to German and other languages, Och said.
"So far, the focus is let's make it really, really good," Och said. "As part of a general Google philosophy, once it's really useful and it has impact, then there will be found ways how to make money out of it."
Miles Osborne, a professor at the University of Edinburgh, who spent a sabbatical last year working on the Google project, praises Google's effort but sees limitations.
"The best systems (e.g. Google) can be very good indeed for language pairs such as Arabic-English," he said.
But he added software will not overtake humans in expert translations as it has in playing chess; software should be used for understanding rather than polishing documents.
"It may also be useful when deciding whether to pay a human to do a good job: you could imagine looking at Japanese patent documents and seeing if they are relevant, for example," he said.
Google chairman Eric Schmidt also sees broad political consequences of a world with easy translations.
"What happens when we have 100 languages in simultaneous translation? Google and other companies are working on statistical machine translation so that we can on demand translate everything all the time," he told a conference earlier this year.
"Many, many societies have operated in language-defined communities where they really don't understand and are not particularly sympathetic to other peoples' views because of the barrier of language. We're about to have that breakthrough and it is a huge thing."
Now let's look at Google's own MT:
新闻Wednesday 28th March 2007周三2007年3月28日Google turns to real-time language translation 4:49PM, Wednesday 28th March 2007谷歌轮流实时语言翻译下午4点49分,周三2007年3月28日
Google believes that in the not too distant future, people will be able to translate documents instantly into the world's main languages, with machine logic, not expert linguists, taking the strain.谷歌认为,在不太遥远的将来,人们将能够翻译文件,瞬间成为世界的主要语言,同机的逻辑,而不是专家,语言学家,以应变。
Google's approach, called statistical machine translation, differs from past efforts in that it forgoes language experts who program grammatical rules and dictionaries into computers.谷歌的做法,所谓的统计机器翻译系统,不同于以往的努力,因为它放弃语言专家,他们计划语法规则及字典到他们的电脑里。
Instead, they feed documents humans have already translated into two languages and then rely on computers to discern patterns for future translations.相反,它们的饲料证件人类已经译成两种语言,然后依靠电脑来识别模式,为今后的翻译工作。
While the quality is not perfect, it is an improvement on previous efforts at machine translation, said Franz Och, 35, a German who heads Google's translation effort at its Mountain View headquarters south of San Francisco.而素质不完美,但它是一种进步对先前的努力机器翻译说,德国och , 35 ,一个德国人元首谷歌的翻译工作,在其芒廷维尤总部以南的旧金山。
'Some people that are in machine translations for a long time and then see our Arabic-English output, then they say, that's amazing, that's a breakthrough,' said Och. '有些人认为是在机器翻译,在相当长的时间,然后再看看我们的阿拉伯语-英语输出,然后他们说,这是了不起的,这是一个突破, '说och 。
'And then other people who have never seen what machine translation was ... '然后其他的人,从来没有见过什么机器翻译是... they read through the sentence and they say, the first mistake here in line five - it doesn't seem to work because there is a mistake there.'他们通读句子,他们说,第一个错误,在此线五-它似乎没有工作,因为有一个有错误。
But for some tasks, a mostly correct translation may be good enough.但对于一些任务,大多数是正确的翻译可能不够好。
For example, Och showed a translation of an Arabic web news site into easily digestible English.举例来说, och显示一个翻译的一份阿拉伯文网站新闻网站成易于消化的英语。
However, two Russian Google workers told us that a translation of a news site from English into their native tongue was understandable but a bit awkward.不过,有两个俄罗斯谷歌工人告诉我们,一个翻译的新闻网站从英语到自己的母语,是可以理解的,但有点尴尬。
Och, who speaks German, English and some Italian, feeds hundreds of millions of words from parallel texts such as Arabic and English into the computer, using UN and EU documents as key sources. och ,讲德语,英语和一些意大利,饲料亿万换言之,从平行文本,如用阿拉伯文和英文输入电脑,利用联合国和欧盟文件作为主要来源。
Languages without considerable translated texts, such as some African languages, face greater obstacles.语言,没有相当的翻译文本,例如一些非洲语言,面临着更大的障碍。
'The more data we feed into the system, the better it gets,' said Och, who moved to the US from Germany in 2002. '更多的数据,我们的饲料到该系统,更好地得到, '说och ,他们转移到美国,从德国在2002年。
The program applies statistical analysis; an approach he hopes will avoid diplomatic faux pas, such as when Russian leader Vladimir Putin's translator miffed该计划适用于统计分析;做法,他希望能够避免外交人造考绩制度,例如当俄罗斯领导人普京的翻译恼火
then German Chancellor Gerhard Schroeder by calling him the German 'Fuehrer.' The word is verboten in that context because of its association with Adolf Hitler.当时的德国总理施罗德致电他的德国' fuehrer '字是verboten在此背景下,由于其与希特勒。
'I would hope that the language model would say, well, Fuehrer Gerhard Schroeder is ... '我希望我们的语言模式会说,好, fuehrer总理施罗德是… … very rare but Bundeskanzler Gerhard Schroeder is probably 100 times more frequent than Fuehrer and then it would make the right decision,' Och said.非常罕见,但bundeskanzler总理施罗德可能是100倍,更频密的,比fuehrer ,然后才会做出正确的决定, ' och说。
The centre of Google's effort looks surprisingly modest.该中心的谷歌的努力看起来出奇地温和。 Och shares a basic office with two others on his team, with little clutter other than a shelf of linguistic books above his desk. och股的一个基本处与另外两人对他的团队,很少有杂波以外大陆架的语言书籍高于他的办公桌。 That's because the muscle work is performed by machines.这是因为肌肉的工作是由机器。
So far, Google is offering its own statistical machine translations of Arabic, Chinese and Russian to and from English .到目前为止,谷歌是推出了自己的统计机器翻译的阿拉伯语,汉语和俄语和英语 。 Third-party software gives access on the site to German and other languages, Och said.第三方软件,让进入该网站上以德语和其它语言, och说。
'So far, the focus is: let's make it really, really good,' Och said. '至今,其重点是:首先我们要它真的,真的好, ' och说。 'As part of a general Google philosophy, once it's really useful and it has impact, then there will be found ways how to make money out of it.' '作为一个普通谷歌哲学,一旦它真的有用,它的影响,届时将有找到了如何使钱出来它。
Miles Osborne, a professor at the University of Edinburgh , who spent a sabbatical last year working on the Google project, praises Google's effort but sees limitations.英里奥斯本教授在英国爱丁堡大学的,他们花了休假去年工作对谷歌项目,歌颂谷歌的努力,但认为局限性。
'The best systems (e.g. Google) can be very good indeed for language pairs such as Arabic-English,' he said. '最好的系统(如谷歌) ,可以很好的,确实为语文双如阿拉伯语-英语, '他说。
But he added software will not overtake humans in expert translations as it has in playing chess; software should be used for understanding rather than polishing documents.不过他又说,软件将不会超越人类的专家翻译,因为它在下棋;软件应该用来为理解而不是抛光文件。
'It may also be useful when deciding whether to pay a human to do a good job: you could imagine looking at Japanese patent documents and seeing if they are relevant, for example,' he said. '它也可能有助于在决定是否要付出人力,以干好工作:你能想象看,日本专利文献和观望,如果他们有相关的,比如, '他说。
Google chairman Eric Schmidt also sees broad political consequences of a world with easy translations.谷歌主席施密特还认为,广泛的政治后果的一个世界同容易翻译。
'What happens when we have 100 languages in simultaneous translation? '会发生什么时,我们有100种语言同声翻译吗? Google and other companies are working on statistical machine translation so that we can on demand translate everything all the time,' he told a conference earlier this year.谷歌和其他公司正致力于统计机器翻译系统,使我们可以对翻译的需求,一切的一切, '他说,会议在今年早些时候。
'Many, many societies have operated in language-defined communities where they really don't understand and are not particularly sympathetic to other peoples' views because of the barrier of language. '很多,很多社团的运作,在语言定义的社区,他们实在不明白,并没有什么特别同情其他国家人民的意见,因为该屏障的语言。 We're about to have that breakthrough and it is a huge thing.'我们将很快有突破,它是一个庞大的事情。
Adam Tanner, Reuters亚当坦纳,路透社
Now let's look at the human translation:
Google's new translation technology: Arabic-language experts stunned
Author: CNET科技资讯网 (CNET Technology News)
CNETNews.com.cn
2007-03-29 09:38:35
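CNET Technology News, March 29, international report: In Google's blueprint for the future, people will be able to translate the world's major languages instantly, relying on machine logic rather than language experts.
Google's approach is called "statistical machine translation," and it differs from the earlier model in which language experts wrote rules and vocabulary into the computer. This method compares documents that people have already translated in two languages, and the computer then uses them to make judgments for future translation tasks. Franz Och, the head of Google's translation project, said that although the results of this technology are not perfect, it is already an improvement over earlier machine translation.
Och said: "Some experts who have worked in machine translation for many years were stunned when they saw our Arabic-to-English translation results; that is a breakthrough. Other people who had never encountered machine translation found only one mistake in the first five sentences, and that mistake was already in the original, so it is not a big problem." However, for most translations, the result is still better after a human has revised it.
At a Google lunch this week, Och translated an Arabic news item, and the English version read smoothly. However, two Russian engineers nearby said that the Russian-to-English translation, although barely understandable, still seemed rather clumsy.
Och himself speaks German, English and a little Italian. He has fed millions of bilingual United Nations and European Union texts, such as Arabic and English texts, into the computer for comparative analysis. Translation for many languages that lack parallel translated texts, such as some languages of African countries, faces great obstacles. Och said: "The more parallel text there is, the better the translation."
Google's translation program uses statistical analysis, and Och hopes this approach can avoid breaches of diplomatic etiquette. For example, when Russian President Putin telephoned German Chancellor Schroeder, Putin called Schroeder the German "Fuehrer" (German for "leader"), and Putin's translator was immediately annoyed, because in German the word refers specifically to Hitler and is taboo.
Och said: "In our system, 'Fuehrer Schroeder' certainly occurs less often than 'German Chancellor Schroeder,' so our translation system will use the title German Chancellor."
At present, Google offers statistical machine translation in its two-way translation systems for Arabic, Chinese, Russian and English, at www.google.com/language_tools.
Miles Osborne, a professor at the University of Edinburgh, took part in the development of Google's translation system last year. He praised Google's efforts in this area but said that machine translation still has limitations. He said: "Google's system can translate well between languages, such as Arabic and English." He also said that machine translation cannot surpass the results of human translation.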