
Big Genomic Data and Bioinformatics: English Text with Chinese Translation

Big Genomic Data in Bioinformatics Cloud

Abstract

The completion of the Human Genome Project has led to a proliferation of genomic sequencing data. Together with next-generation sequencing, it has helped reduce the cost of sequencing, which has in turn increased the demand for analysis of these large genomic data sets. The data and their processing have aided medical research.

We therefore need expertise in handling biological big data. Cloud computing and big data technologies such as the Apache Hadoop project are needed to store, handle and analyse these data, because they provide distributed, parallelised processing and can analyse data sets even at petabyte (PB) scale. They also have drawbacks, chiefly long data transfer times and limited network bandwidth.

人类基因组计划的实现导致基因组测序数据的增殖。这与下一代测序一起有助于降低测序的成本,这进一步增加了对这种大基因组数据的分析的需求。该数据集及其处理有助于医学研究。

因此,我们需要专门知识来处理生物大数据。因此,需要云计算和大数据技术(例如Apache Hadoop项目)的概念来存储,处理和分析这些数据。因为,这些技术提供分布式和并行化的数据处理,并且能够有效地分析甚至PB级的数据集。然而,也有一些缺点,可能包括需要更大的时间来传输数据和更小的网络带宽,主要。

Introduction

The introduction of next-generation sequencing has produced unrivalled volumes of sequence data, so modern biology faces challenges in data management and analysis.

A single human's DNA comprises around 3 billion base pairs (bp), representing approximately 100 gigabytes (GB) of data. Bioinformatics is encountering difficulty in storing and analysing such data. Moore's Law suggests that computers double in speed and halve in size every 18 months, and reports say that biological data will accumulate at an even faster pace [1]. The cost of sequencing a human genome fell from $1 million in 2007 to $1 thousand in 2012. With this falling cost, and after the completion of the Human Genome Project in 2003, a flood of biological sequence data was generated. The sequencing and cataloguing of genetic information has increased manyfold (as can be observed from the GenBank database of NCBI). Medical research institutes such as the National Cancer Institute are targeting the sequencing of a million genomes in order to understand biological pathways and genomic variation and to predict the causes of disease. Given that the whole genome of a tumour and a matching normal tissue sample consumes 0.1 TB of compressed data, one million genomes will require 0.1 million TB, i.e. about 100 PB (petabytes) [2]. This explosion of biological data, on a scale that exceeds the capacity of a single machine, has made the data more expensive to store, process and analyse than to generate. This has stimulated the use of the cloud to avoid large capital infrastructure and maintenance costs.
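The storage estimate in the paragraph above can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted in the text (0.1 TB per tumour/normal pair, one million genomes); the choice of decimal prefixes (1 PB = 1000 TB) and all names are assumptions made for illustration. Under those assumptions the total comes out at roughly 100 PB.

```python
# Back-of-the-envelope storage estimate for a million-genome project.
# The per-sample size and genome count are the figures quoted in the text;
# decimal units (1 PB = 1000 TB) are an assumption of this sketch.

TB_PER_TUMOUR_NORMAL_PAIR = 0.1   # compressed size quoted in the text
GENOMES = 1_000_000               # target number of genomes
TB_PER_PB = 1000                  # assumption: decimal (SI) prefixes


def total_storage_tb(tb_per_genome: float, genomes: int) -> float:
    """Total storage in terabytes for `genomes` samples."""
    return tb_per_genome * genomes


def tb_to_pb(terabytes: float, tb_per_pb: int = TB_PER_PB) -> float:
    """Convert terabytes to petabytes."""
    return terabytes / tb_per_pb


total_tb = total_storage_tb(TB_PER_TUMOUR_NORMAL_PAIR, GENOMES)
total_pb = tb_to_pb(total_tb)
print(f"{total_tb:,.0f} TB is about {total_pb:,.0f} PB")
```

With binary prefixes (1 PB = 1024 TB) the result would be slightly lower, so the estimate is best read as an order of magnitude.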

Handling such data requires a move away from conventional structured data (row-column organisation) towards semi-structured or unstructured data, and applications must be developed that execute in parallel on distributed data sets. With effective use of big data in the healthcare sector, a reduction of around 8% in expenditure is possible, which would amount to savings of $300 billion annually.

下一代测序的引入带来了前所未有的序列数据量。因此,现代生物学在数据管理和分析领域面临挑战。单个人类DNA包含约30亿个碱基对(bp),表示约100吉字节(GB)的数据。生物信息学在这种数据的存储和分析中遇到困难。摩尔定律认为,计算机每18个月速度增加一倍、体积减小一半。报告说,生物数据将以更快的速度积累[1]。人类基因组测序的成本从2007年的100万美元降至2012年的1千美元。随着测序成本的下降,在2003年人类基因组项目完成后,产生了海量的生物序列数据。测序和编目的遗传信息已经增加了许多倍(如从NCBI的GenBank数据库可以观察到的)。诸如国家癌症研究所的各种医学研究机构正致力于对一百万个基因组进行测序,用于理解生物学途径和基因组变异以预测疾病的原因。假定肿瘤的全基因组和匹配的正常组织样品消耗0.1 TB的压缩数据,则一百万基因组将需要10万TB,即约100 PB(petabyte)[2]。生物学数据的爆炸(数据的规模超过单个机器)使得其存储、处理和分析比其生成更昂贵。这刺激了云的使用,以避免大的资本基础设施和维护成本。

实际上,它需要从公共结构化数据(行- 列组织)偏移到半结构化或非结构化数据。并且需要开发在分布式数据集上并行执行的应用程序。随着医疗行业大数据的有效利用,支出减少约8%,每年可节省3000亿美元。

Review

Cloud computing

Cloud computing is defined as "a pay-per-use model for enabling convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction" [3]. Some of the major concepts involved are grid computing, distributed systems, parallelised programming and virtualisation technology. Through virtualisation, a single physical machine can host multiple virtual machines. The problem with grid computing was that most of the effort was spent on maintaining the robustness and resilience of the cluster itself. Big data technologies have now identified solutions for processing huge parallelised data sets cost-effectively. Cloud computing and big data technologies are two different things: the former facilitates cost-effective storage, and the latter is a Platform as a Service (PaaS).

There are three types of cloud: public, private and hybrid. A public cloud refers to resources such as infrastructure, applications and platforms made available to the general public, accessible only through the Internet on a "pay as you go" basis. A private cloud refers to virtualised cloud infrastructure owned, housed and managed by a single organisation. A hybrid cloud connects private and public clouds via Virtual Private Networking (VPN) for scalability and fault tolerance. A fourth model, the community cloud, has also been proposed: organisations with a shared interest, such as public sector organisations, contribute financially towards a common cloud infrastructure.

云计算被定义为“用于实现对可快速供应和释放的可配置计算资源(例如,网络,服务器,存储,应用和服务)的共享池的方便,按需的网络访问的按使用付费模型与最小的管理努力或服务提供者交互”[3]。涉及的一些主要概念是网格计算,分布式系统,并行编程和虚拟化技术。单个物理机器可以通过虚拟化技术托管多个虚拟机。网格计算的问题是,努力主要花在维护集群本身的鲁棒性和弹性。大数据技术现在已经确定了以成本有效的方式处理大量并行数据集的解决方案。云计算和大数据技术是两个不同的事情,前者促进成本有效的存储,后者是平台即服务(PaaS)。

三种类型的云是:公共云,私有云和混合云。第一种是指向一般公众提供的基础设施,应用程序,平台等资源,只能通过互联网以“按需付费”的方式访问。第二个是指由单个组织拥有,安置和管理的虚拟化云基础设施。第三种是指私有和公共连接,通过虚拟专用网(VPN)实现可扩展性和容错。还提出了第四个模型,即社区云。这里的组织像公共部门组织,具有相同的兴趣,可以贡献财务到云基础设施。

Genomics through big data technologies

Implementing big data technologies for storing, processing and analysing the genomics data of medical research can profoundly impact mankind. Timely processing of data and its subsequent analysis are still a challenge. One solution could be the adoption of leading big data technologies such as Hadoop. There have been studies on the use of the Apache Hadoop platform in bioinformatics projects [4].

随着大数据技术在存储,处理和分析基因组学数据的实施,医学研究可以深刻地影响人类。及时处理数据,以及随后的分析仍然是一个挑战。解决方案可能是实施领先的大数据技术,如Hadoop。已经有关于Apache Hadoop平台在生物信息学项目中的利用的研究[4]。

Bioinformatics tools developed

MapReduce projects [5]

Crossbow project [6]

BlastReduce project [7]

CloudBurst [8]

Crossbow [9]

Cloudera

Cloudera is a service provider in the big data platform market and a leading provider of Apache Hadoop software. It contributes more than 50% of its output to open source (Apache-licensed) projects, placing it at the cutting edge of the development of big data technology and the Hadoop framework. It was established by leading engineers from Google, Yahoo and Facebook along with an Oracle executive, who were later joined by the founder of the Apache Hadoop project [3]. Cloudera is a pioneer of big data and cloud computing in biomedical research. Its chief scientist and co-founder aims to dedicate 25% of their time to the use of computational biology in genomics [10]. Leading pioneers of big data and computational biology, along with leading multinationals, are thus committing to aid medical discovery by contributing to the analysis of large biological data sets for the understanding, diagnosis and treatment of diseases. This is timely, because healthcare computing is projected to grow at around 20.5% annually through 2017 [11].

Cloudera作为大数据平台中的服务提供商,是领先的Apache Hadoop软件。它将其50%的输出贡献给开源(Apache许可)项目,在大数据技术和Hadoop框架的开发中占据了前沿。它由Google,雅虎和Facebook领先的工程师与Oracle高管共同建立,后来他们被Apache Hadoop项目的创始人加入。[3]

Cloudera是大数据和云计算在生物医学研究领域的先驱。Cloudera的首席科学家和联合创始人,旨在将25%的时间用于在基因组学中使用计算生物学[10]。因此,大数据和计算生物学的领先先驱与领先的跨国公司现在承诺通过对大型生物数据的分析,疾病的理解,诊断和治疗的贡献,帮助医疗发现。事实上,这是小时的需要,因为到2017年,医疗保健计算的年增长率将达到20.5%左右[11]。

Hadoop

Two key modules: i) MapReduce ii) Hadoop Distributed File System (HDFS)

1. MapReduce: a computational problem is divided into many small sub-problems, which are distributed across multiple nodes of a computer cluster.

2. HDFS: a distributed file system for storing data on these nodes.

Such software is designed for load balancing among nodes and for distributed processing of large data sets, enabling fault-tolerant, parallelised analysis. A bioinformatics cloud involves services such as data storage, acquisition and analysis, since a cloud platform delivers hosted services over the Internet. These services fall into four categories: Data as a Service, Software as a Service, Platform as a Service, and Infrastructure as a Service [12-16].
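The divide-and-combine idea behind MapReduce can be illustrated with a small, single-process sketch. It is not Hadoop code and does not use any Hadoop API; it only mimics the map, shuffle and reduce phases in plain Python. The k-mer counting task, the function names and the sample reads are all assumptions chosen for illustration.

```python
from collections import defaultdict
from itertools import chain
from typing import Dict, Iterable, Iterator, List, Tuple


def map_kmers(read: str, k: int) -> Iterator[Tuple[str, int]]:
    """Map step: emit a (k-mer, 1) pair for every k-mer in one read."""
    for i in range(len(read) - k + 1):
        yield read[i:i + k], 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle step: group all emitted values by their key."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(groups: Dict[str, List[int]]) -> Dict[str, int]:
    """Reduce step: sum the values collected for each key."""
    return {key: sum(values) for key, values in groups.items()}


def count_kmers(reads: List[str], k: int) -> Dict[str, int]:
    """Run map, shuffle and reduce in sequence over all reads."""
    mapped = chain.from_iterable(map_kmers(read, k) for read in reads)
    return reduce_counts(shuffle(mapped))


reads = ["ACGTAC", "GTACGT"]          # made-up example reads
counts = count_kmers(reads, k=3)
print(counts)
```

In a real cluster each read (or block of reads) would be mapped on a different node and the grouped keys would be reduced in parallel; here everything runs in one process so that the data flow is easy to follow.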

两个关键模块:i)MapReduce ii)Hadoop分布式文件系统(HDFS)

1.计算程序被分为许多小的子问题。分布在计算机的多个节点上。

2.用于在这些节点上存储数据的分布式文件系统。

这样的软件被设计用于在不同节点之间的负载平衡,并允许大型数据集的分布式处理,使得容错并行化分析成为可能。生物信息云涉及数据存储,采集,分析等服务,因为云平台通过Internet提供托管服务。它可以分为四类:数据即服务,软件即服务,平台即服务和基础设施即服务[12-16]。

Data as a service (DaaS)

Bioinformatics clouds depend on data for downstream analyses. "It is reported that annual worldwide sequencing capacity is beyond 13 Pbp and on an increase by a factor of five every year" [17]. Because of this unprecedented explosion of data, the delivery of Data as a Service (DaaS) over the Internet has gained importance. DaaS provides dynamic, on-demand access to up-to-date data for a wide range of devices connected over the Web.

Amazon Web Services (AWS) provides a centralised cloud of public data sets from biology, economics and other fields as services, e.g. archives of GenBank, the Ensembl databases, 1000 Genomes, the Model Organism Encyclopedia and UniGene [18].

生物信息学云取决于下游分析的数据。“据报道,全球每年的测序能力超过13 Pbp,每年增加5倍”[17]。由于这种数据泄露的爆炸式增长,通过因特网的数据即服务(DaaS)交付已变得越来越重要。它可根据需要提供动态数据访问,以及通过Web连接的各种设备的最新数据访问。

亚马逊网络服务(AWS)提供生物学,经济学等作为服务的公共数据集(例如GenBank,Ensembl数据库,1000基因组,模型生物百科全书,Unigene等的归档)的集中云。

Software as a service (SaaS)

SaaS delivers a wide variety of software services online for different types of data analysis, giving remote access to heavyweight bioinformatics software. It thus eliminates the need for local installation and eases software maintenance. Up-to-date cloud-based services for bioinformatic data analysis have made life easier for users.

Efforts have been made to develop cloud-scale and cloud-based tools for sequence mapping [19], multiple sequence alignment [20], expression analysis [21], identification of epistatic interactions of SNPs (single nucleotide polymorphisms) [22], and NGS (next-generation sequencing) data analysis.

SaaS在线提供各种各样的软件服务,用于不同类型的数据分析,便于远程访问各种重型生物信息学软件。因此,它消除了对本地安装的需要,从而简化软件维护。最新的基于云的生物信息数据分析服务为用户带来了轻松的生活。

已经开发了云尺度和基于云的序列作图[19],多重序列比对[20],表达分析[21],SNPs (单核苷酸多态性)上位相互作用的鉴定[22]和NGS 下一代测序)。

Platform as a service (PaaS)

PaaS allows users to develop, test and use cloud applications in an environment where computing resources scale automatically and dynamically to match application demand. This scalability helps in developing applications for biological data.

Two PaaS platforms:

1. Eoulsan, a cloud-based platform for high-throughput sequencing analyses [23];

2. Galaxy Cloud, a cloud-scale platform for large-scale data analyses [24].

PaaS允许用户在计算机资源自动和动态地扩展以匹配应用程序需求的环境中开发,测试和使用云应用程序。这种可扩展性因素有助于开发生物数据的应用程序。

两个PaaS平台:

1. Eoulsan,基于云的高通量测序分析[23];

2. Galaxy Cloud,云规模- 用于大规模数据分析[24]。

Infrastructure as a service (IaaS)

IaaS delivers virtualised resources of all kinds, including hardware (e.g. CPUs) and software (e.g. operating systems), which together make up a full computer infrastructure delivered over the Internet. Users access these virtualised resources as a public utility and pay only for the cloud resources they use. Flexibility and customisation let different users access different cloud resources according to their requirements.

Examples:

1. Cloud BioLinux is a publicly accessible virtual machine for high-performance bioinformatics computing [25].

2. CloVR is a portable virtual machine that incorporates several pipelines for automated sequence analysis [26].

IaaS提供各种资源(虚拟化),包括CPU(硬件),操作系统(软件)等等,总计完整的计算机基础设施,通过互联网充分发挥计算机资源的潜力。虚拟化资源可以作为用户的公用设施访问,从而为他们使用的云资源付费。灵活性和定制使得不同用户可以根据自己的需求访问不同的云资源,从而满足不同用户的定制需求。

例子:

1. Cloud Bio Linux是一个可以高性能生物信息学计算公开访问的虚拟机[25]。

2. Clo VR是一种便携式虚拟机,它包含了几个用于自动序列分析的管道[26]。

Bioinformatics cloud

Data in the cloud

The traditional method of analysis involves downloading data from NCBI, Ensembl and similar sources and installing software locally on in-house computers. Placing data and software in the cloud makes it possible to deliver them as DaaS or SaaS, and both can be seamlessly integrated into the cloud, so that storing biological data there allows big data analysis to take place within the cloud. At present we mostly use conventional biological databases rather than cloud-based ones, but larger sequencing projects that generate ultra-large volumes of data will require the cloud for big data analysis and sharing [27,28]. Genome 10K, the 1001 Genomes Project, 1KITE and TCGA are projects of this kind, in which answering complex biological queries involves the use of big data tools [29].

初始分析方法涉及从NCBI,Ensembl等下载数据,并在本地计算机上安装软件。在云中放置数据和加载软件,使其成为DaaS或SaaS。两者都可以无缝集成到云。因此,生物数据的存储实现了云内大数据分析的目的。我们使用传统的生物数据库而不是云。但是,对于更大的测序项目,生成超大量的数据,将需要云进行大数据分析和共享[27,28]。像Genome 10K,1001 Genomes Project,1KITE,TCGA等项目是类似的需要大数据分析的项目,其中复杂生物查询的解决方案涉及大数据工具的利用[29]。

Transferring big data

The bottleneck of cloud computing is the transfer of data into the cloud. Instead of physically shipping hard drives to the cloud centre, a promising solution is to integrate innovative transfer technologies with cloud computing. One example is the cloud-based Easy Genomics, designed for high-speed genomic data transfer. Genomic data have been successfully transferred across the Pacific Ocean at a rate of about 10 gigabits per second, which showed that such technologies are capable of handling big data over the Web. In addition, technologies such as data compression and peer-to-peer (P2P) data distribution can aid big data transfer [30].
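To see why transfer is the bottleneck, it helps to estimate how long a transfer would take at the quoted rate. The sketch below is a rough calculation, assuming an ideal, fully sustained 10 gigabit-per-second link with no protocol overhead, and decimal units; the 100 GB genome and the 100,000 TB cohort are the figures given earlier in the article, and the function name is illustrative.

```python
# Idealised transfer-time estimate; real links rarely sustain their
# nominal rate, so these numbers are lower bounds.

GB = 10**9                      # bytes, decimal prefix (assumption)
TB = 10**12                     # bytes, decimal prefix (assumption)
LINK_BITS_PER_S = 10 * 10**9    # about 10 gigabits per second, as quoted
SECONDS_PER_DAY = 86_400


def transfer_time_seconds(size_bytes: float, rate_bits_per_s: float) -> float:
    """Time to move `size_bytes` over a link of `rate_bits_per_s`."""
    return size_bytes * 8 / rate_bits_per_s


# One genome of roughly 100 GB.
genome_seconds = transfer_time_seconds(100 * GB, LINK_BITS_PER_S)

# A million-genome cohort of roughly 100,000 TB.
cohort_days = transfer_time_seconds(100_000 * TB, LINK_BITS_PER_S) / SECONDS_PER_DAY

print(f"single genome: {genome_seconds:.0f} s")
print(f"million-genome cohort: {cohort_days:.0f} days")
```

Even under these optimistic assumptions a single genome takes over a minute and the full cohort takes years on one link, which is why compression and P2P distribution are attractive.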

云计算的瓶颈是将数据传输到云中。而不是将硬盘驱动器物理运送到云中心,一个有前途的解决方案可能是将创新的传输技术与云计算集成。一种是基于云的Easy Genomics,用于高速基因组数据传输。有一个成功的事件,以大约10吉比特每秒的速度跨太平洋传输基因组数据,这证明了技术能够通过网络处理大数据。除此之外,还有诸如数据压缩和对等(P2P)数据分发等技术来帮助大数据传输[30]。

Cloud-based programming

An analysis task is implemented as a pipeline by linking the outputs of some tools to the inputs of others, which automates the analysis. Customised pipelines need to be developed for large-scale, automated and configurable data analysis in a cloud-based environment.

A similar programming paradigm is adopted in Hadoop, where a single task is distributed over multiple nodes. Developing cloud-based pipelines in Hadoop requires computational skills, though not extensive coding; what is needed is a system for data exchange that paves the way for a programming environment [31].
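The notion of a pipeline, in which each tool's output becomes the next tool's input, can be sketched in a few lines. The three steps below (filtering, adapter trimming, de-duplication), the adapter sequence and the sample reads are hypothetical stand-ins for real bioinformatics tools, chosen only to show the chaining.

```python
from functools import reduce
from typing import Callable, List

# A pipeline step takes a list of reads and returns a new list of reads.
Step = Callable[[List[str]], List[str]]


def drop_ambiguous(reads: List[str]) -> List[str]:
    """Discard reads that contain the ambiguous base 'N'."""
    return [r for r in reads if "N" not in r]


def trim_adapter(reads: List[str], adapter: str = "AGAT") -> List[str]:
    """Remove a trailing adapter (hypothetical sequence) from each read."""
    return [r[: -len(adapter)] if r.endswith(adapter) else r for r in reads]


def deduplicate(reads: List[str]) -> List[str]:
    """Drop repeated reads while keeping first-seen order."""
    return list(dict.fromkeys(reads))


def run_pipeline(steps: List[Step], reads: List[str]) -> List[str]:
    """Feed the output of each step into the input of the next."""
    return reduce(lambda data, step: step(data), steps, reads)


raw_reads = ["ACGTAGAT", "ACNTAGAT", "ACGTAGAT", "GGCC"]   # made-up reads
clean_reads = run_pipeline([drop_ambiguous, trim_adapter, deduplicate], raw_reads)
print(clean_reads)
```

Because the pipeline is just an ordered list of steps, it can be reconfigured by reordering or swapping entries, which is the kind of customisation the text calls for.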

分析任务通过工具输出与其他工具的输入之间的联系来实现为管道,以使系统自动化。需要开发定制管道以在基于云的环境上进行大规模自动化和可配置的数据分析。

通过Hadoop采用类似的编程范例,其中单个任务分布在多个节点上。在Hadoop中开发基于云的管线需要计算技能,而不需要大量编码,而是建立一个用于数据交换的系统为编程环境铺平道路[31]。

Bioinformatics cloud

At present, the biggest cloud provider is Amazon, which offers commercial clouds for big data processing. Google is another provider, allowing users to develop web applications and analyse data. Commercial clouds still have more to do to provide ample data and software and to keep pace with the emerging needs of research, which requires customised clouds for bioinformatics analysis. Open access to, and public availability of, data and software are equally significant [32]. When data and software are in the cloud, it is essential that the cloud be publicly available to the scientific community [33]. This ensures data integration, reproducible analyses and maximum scope for sharing.

目前,最大的云提供商是亚马逊,为大数据处理提供商业云。Google是另一个供应商,允许用户开发网络应用程序和分析数据。还需要做更多的工作来提供充足的数据和软件,以及保持研究的新兴需求的步伐,这需要定制云的生物信息学分析。开放获取和数据和软件的公共可用性同等重要[32]。当数据和软件在云中时,云对科学界公开的可用性是至关重要的[33]。它确保数据集成,可重复分析,最大范围的共享。

Potential Challenges

Genomics research involving enormous amounts of data has recognised the potential benefits of moving to the cloud, but cloud computing raises some concerns as well. Optimising genomics analysis for the cloud has provided efficient and timely services; for instance, data can be streamed from the sequencing facility to an analysis pipeline in the cloud as they are generated. However, there is a need to be aware of the potential challenges in adopting cloud computing technologies.

Hadoop programming requires a high level of Java expertise; it needs to be simplified to an SQL-like interface that generates parallelised programs. The standardisation of reporting and summarisation of results is a problem that has received little attention, and better analytics and visualisation technologies need to be developed. Without a front-end visual interface, Hadoop is difficult to set up, use and maintain; efforts are being made to introduce developer-friendly management interfaces in place of shell or command-line interfaces.

Given the scale of the genomic data that must be transmitted over the Internet, transfer takes a considerable amount of time (at times extending to weeks). The data transfer rate thus remains a bottleneck of the technology [36]. Data tenancy is another challenge. Most clouds offer limited data and service interoperability, making it difficult for a customer to move data and services back to an in-house IT environment or to migrate from one provider to another. Moreover, data privacy legislation, legal ownership and responsibility for data stored across international zones pose a further challenge [37]. Nevertheless, genomics and proteomics research projects clearly demonstrate the applications of next-generation cloud-based computational biology, which has the potential to revolutionise the pace of research in the life sciences.

具有大量数据的基因组学研究已经认识到移动到云的潜在好处,但同时云计算也引起了一些关注。云的基因组分析的优化提供了高效和及时的服务。例如,数据可以容易地从测序设备运行到云上的分析流水线,因为它是生成的。然而,需要了解采用云计算技术的各种潜在挑战。

Hadoop编程需要高水平的Java专业知识;它需要简化为类似SQL的接口来生成并行程序。标准化报告和总结结果是一个没有得到很多解决的问题;需要开发更好的分析和可视化技术。Hadoop没有前端可视化是很难设置,使用和维护;正在努力引入开发者友好的管理接口而不是shell /命令行接口。

考虑到需要通过因特网传输的基因组数据的规模,需要相当大量的时间(可能延长到几个星期)。因此,数据传输的速率仍然是该技术的瓶颈[36]。数据租赁是另一个挑战。大多数云对数据和服务互操作性提供较少的能力,使得客户难以将数据和服务移回到内部IT环境或从一个提供商迁移到另一个。此外,数据隐私立法,法律所有权和与存储在国际区域之间的数据有关的责任指出了另一个挑战[37]。然而,基因组学和蛋白质组学研究项目肯定会展示下一代基于云的计算生物学的应用,它本质上有可能改变生命科学研究的步伐。

Security

Privacy and confidentiality must be maintained, especially when dealing with health information. Cloud computing offers data encryption, password protection, secure data transfer, process audits, and the implementation of policies against data breaches and malicious use [34]. The involvement of an external entity in data storage and processing raises additional security concerns. Access logging, role-based access, third-party certification, computer network security, notification alarms, change trackers, cloud usage terms and associated services are used to address these concerns [35].
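Two of the measures named above, role-based access and access logging, can be combined in a very small sketch. The roles, permissions and user names below are hypothetical, and a real deployment would rely on the cloud provider's own identity and audit services rather than on code like this; the point is only to show that every request is checked against a role and recorded whether or not it succeeds.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, List, Set

# Hypothetical mapping of roles to the actions they may perform.
ROLE_PERMISSIONS: Dict[str, Set[str]] = {
    "clinician": {"read"},
    "analyst": {"read", "write"},
    "auditor": {"read_log"},
}


@dataclass
class AccessLog:
    """In-memory audit trail; each entry records one access attempt."""
    entries: List[dict] = field(default_factory=list)

    def record(self, user: str, role: str, action: str, allowed: bool) -> None:
        self.entries.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "role": role,
            "action": action,
            "allowed": allowed,
        })


def authorize(user: str, role: str, action: str, log: AccessLog) -> bool:
    """Allow the action only if the role permits it, and log the attempt."""
    allowed = action in ROLE_PERMISSIONS.get(role, set())
    log.record(user, role, action, allowed)
    return allowed


log = AccessLog()
print(authorize("alice", "analyst", "write", log))    # permitted by role
print(authorize("bob", "clinician", "write", log))    # denied, but still logged
print(len(log.entries))
```

Logging denied attempts as well as granted ones is a deliberate choice here, since failed access attempts are often the most useful entries in an audit.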

隐私和保密是在处理健康信息时必须保持的。云计算提供了数据加密,密码保护,安全数据传输,流程审计以及针对数据流量和恶意使用实施相应策略的使用[34]。外部实体参与数据存储和处理服务提供了额外的安全问题。记录对数据的访问,基于角色的访问,第三方认证,计算机网络安全,通知报警,变化跟踪,云使用期限和相关服务,以解决这些问题[35]。

Future in microbiology research

Petabytes of raw information can revolutionise microbiology research if we succeed in working out how to use this gold mine. Winston Hide says, “In the last five years, more scientific data has been generated than in the entire history of mankind”. Data generation today is far faster than it was just a few years ago, and it is hard to imagine the amount of digital information now available to us. Studying respiratory disease, for example, requires capturing huge quantities of air quality data and then matching them with equally large data sets; such studies involve big data. We need to engage many eyes in this process.

如果我们成功地想出如何使用这个金矿,那么PB级的原始信息可以革新微生物学研究。Winston Hide说:“在过去五年里,生成的科学数据比人类整个历史上生成的还要多。”今天,数据生成的速度比仅仅几年前快得多,因此我们难以想象现在可用的数字信息量。例如,研究呼吸系统疾病需要捕获大量的空气质量数据,然后将其与同样庞大的数据集相匹配,这类研究都涉及大数据。我们需要让更多的人参与这个过程。

Conclusion

Cloud computing has seen a lot of hype and excitement recently, but in the biotech industry it is gradually gaining recognition as a serious alternative to existing hardware infrastructure. Parallel DNA sequencing generates massive amounts of data, and its interdisciplinary nature calls for cloud computing and big data technologies in the life sciences. These facilitate high-throughput analytics, allowing users to interrogate vast data sets quickly. Metagenomics, systems biology and protein structure prediction require extensive use of big data technology [12]. Metagenomics, a result of the genomics revolution, opened the way to sequence-based analysis of the microbiome (i.e. microbial genomes), which is going to be several orders of magnitude bigger [13]. Consider the total number of bacterial cells on Earth: it must be in the range of 10^30, most of them still unidentified. Newly discovered genes encode new proteins whose structure and function need to be characterised [14].

Next-generation cloud-based computational biology has the potential to revolutionise the life sciences. Cloud-based resources, classified as DaaS, SaaS, PaaS and IaaS, hold great promise for big data analysis: they provide a variety of services for data storage, acquisition and analysis by integrating data and software, together with efficient high-speed transfer technologies that aid the transfer of big data. They offer a lightweight programming environment and allow customised pipelines to be developed and made publicly accessible to the whole scientific community. Although challenges remain to be overcome, the potential advantages that these technologies can bring to genomic research far outweigh the disadvantages.

云计算近来已经引起了大量的炒作和兴奋,但在生物技术行业,它逐渐被认为是现有硬件基础设施的一个重要替代方案。平行DNA测序产生大量的数据,其跨学科性质要求在生命科学中采用云计算和大数据技术。它促进高吞吐量分析,允许用户在很短时间内查询大量数据。宏基因组学,系统生物学和蛋白质结构预测需要广泛使用大数据技术[12]。宏基因组学作为基因组学革命的结果,开启了基于序列的微生物组(即微生物基因组)分析,其数据量将大几个数量级[13]。试着数一数地球上细菌细胞的总数;必须在10^30的范围内,其中大多数仍然是未鉴定的。新发现的基因编码新的蛋白质,其结构和功能需要被表征[14]。

下一代基于云的计算生物学有可能改变生命科学。分类为DaaS,SaaS,PaaS和IaaS的基于云的资源在解决大数据分析,开发各种服务用于数据存储,通过集成数据和软件获取和分析,作为高效的高速传输技术来帮助传输大数据。它提供了轻量级的编程环境,并开发了可供整个科学界公开访问的定制管道。尽管存在着尚待克服的挑战,但这些技术可能为基因组研究带来的潜在优势远远超过了缺点。

常用金融英语词汇的翻译知识讲解

常用金融英语词汇的 翻译

常用金融英语词汇的翻译 acquiring company 收购公司 bad loan 呆帐 chart of cash flow 现金流量表 clearly-established ownership 产权清晰 debt to equity 债转股 diversity of equities 股权多元化 economy of scale 规模经济 emerging economies 新兴经济 exchange-rate regime 汇率机制 fund and financing 筹资融资 global financial architecture 全球金融体系 global integration, globality 全球一体化,全球化 go public 上市 growth spurt (经济的)急剧增长 have one's "two commas" 百万富翁 hedge against 套期保值 housing mortgage 住房按揭 holdings 控股,所持股份 holding company 控股公司 initial offerings 原始股 initial public offerings 首次公募 innovative business 创新企业 intellectual capital 智力资本 inter-bank lending 拆借 internet customer 网上客户 investment payoff period 投资回收期 joint-stock 参股 mall rat 爱逛商店的年轻人 means of production 生产要素 (the)medical cost social pool for major diseases 大病医疗费用社会统筹mergers and acquisitions 并购

大数据单位的换算与翻译

大数据单位的换算与翻译 近几年来,大数据这个词越来越频繁地出现在各种媒体文章上,出现各行各业人士的口中。人工翻译的行业也难免受其影响。在这方面,赛迪翻译亦深有体会。 首先也是最为重要的一点是大数据方面的词语频繁出现。 以往,我们说数据大小,常常使用的单位是MB、GB,而现在我们经常会看到TB、PB、EB、ZB、YB、BB、NB、DB。身为翻译人员,不免要弄清楚这些单位的大小和译法。 从大小方面,这几种单位依然延续了1024 的进制。即,后一个单位是前一个单位的1024 倍。在此,赛迪翻译总结了这些大数据单位。具体的大小如下: 1KB (Kilobyte) = 1024B ,即2 的10 次方字节,读音千字节 1MB (Megabyte) = 1024KB,即2 的20 次方字节,读音兆字节 1GB (Gigabyte) = 1024MB,即2 的30 次方字节,读音吉字节 1TB (Terabyte) = 1024GB,即2 的40 次方字节,读音太字节 1PB (Petabyte) = 1024TB,即2 的50 次方字节,读音拍字节 1EB (Exabyte) = 1024PB,即2 的60 次方字节,读音艾字节 1ZB (Zettabyte) = 1024EB,即2 的70 次方字节,读音泽字节 1YB (Yottabyte) = 1024ZB,即2 的80 次方字节,读音尧字节 1 BB (BrontoByte)= 1024 YB,即 2 的90 次方字节,读音波字节 1NB (NonaByte) = 1024BB,即2 的100 次方字节,读音诺字节 1DB (DoggaByte) = 1024NB,即2 的110 次方字节,读音刀字节

金融专业名词翻译

金融专业名词翻译(一) 一级市场 primary market 一致行动 acting in concert 一般性授权 general mandate 一般谘询及支援工作小组 General Advisory and Support Working Team 一般豁免 general waiver; blanket waiver 一般权证 plain vanilla warrant 一线多机 multiple-work stations 一篮子指数买卖盘 index basket order 一篮子货币 basket of currencies; currency basket 一篮子备兑证;一篮子权证 basket warrant 九龙证券交易所

The Kowloon Stock Exchange 金融专业名词翻译(二) 上下限协议ceiling-floor agreement 上市listing; flotation 上市(复核)委员会Listing (Review) Committee 上市上诉委员会Listing Appeals Committee 《上市公司董事指引》Guide for Directors of Listed Companies 《上市公司董事进行证券交易的标准守则》Model Code for Securities Transactions by Directors of Listed Issuers 上市公司资料库Primary Market Database 上市事宜谅解备忘录补篇(附件一) First Supplement to the Addendum to Memorandum of Understanding Governing Listing Matters 上市事宜谅解备忘录补篇(重订版) Amended and Restated Addendum to Memorandum of Understanding Governing Listing Matters 上市协议Listing Agreement 上市委员会Listing Committee 上市法团listed corporation 上市后的收购活动post-listing acquisition 上市规则listing rules 上市开放式基金listed open-end fund (LOF)

双语:中国姓氏英文翻译对照大合集

[ ]

步Poo 百里Pai-li C: 蔡/柴Tsia/Choi/Tsai 曹/晁/巢Chao/Chiao/Tsao 岑Cheng 崔Tsui 查Cha 常Chiong 车Che 陈Chen/Chan/Tan 成/程Cheng 池Chi 褚/楚Chu 淳于Chwen-yu

D: 戴/代Day/Tai 邓Teng/Tang/Tung 狄Ti 刁Tiao 丁Ting/T 董/东Tung/Tong 窦Tou 杜To/Du/Too 段Tuan 端木Duan-mu 东郭Tung-kuo 东方Tung-fang F: 范/樊Fan/Van

房/方Fang 费Fei 冯/凤/封Fung/Fong 符/傅Fu/Foo G: 盖Kai 甘Kan 高/郜Gao/Kao 葛Keh 耿Keng 弓/宫/龚/恭Kung 勾Kou 古/谷/顾Ku/Koo 桂Kwei 管/关Kuan/Kwan

郭/国Kwok/Kuo 公孙Kung-sun 公羊Kung-yang 公冶Kung-yeh 谷梁Ku-liang H: 海Hay 韩Hon/Han 杭Hang 郝Hoa/Howe 何/贺Ho 桓Won 侯Hou 洪Hung 胡/扈Hu/Hoo

花/华Hua 宦Huan 黄Wong/Hwang 霍Huo 皇甫Hwang-fu 呼延Hu-yen J: 纪/翼/季/吉/嵇/汲/籍/姬Chi 居Chu 贾Chia 翦/简Jen/Jane/Chieh 蒋/姜/江/ Chiang/Kwong 焦Chiao 金/靳Jin/King 景/荆King/Ching

金融英语翻译范文

金融英语翻译范文

China raises interest rates to slow inflation The People's Bank of China, the central bank, raised key savings and lending interest rates from Sunday, March 18, the third time in 11 months in a bid to curb inflation and asset bubbles in the world's fastest-growing major economy. The one-year benchmark lending rate will be raised to 6.39 percent from 6.12 percent, and the one-year deposit rate will be increased to 2.79 percent from 2.52 percent, according to a statement on the bank's website (https://www.sodocs.net/doc/0610577596.html,) . Central bank Governor Zhou Xiao chuan is concerned that cash from a record trade surplus is stoking excess investment, raising the risk of accelerating inflation and boom-and-bust cycles in asset prices. Zhou has resisted calls from Europe and the US to let the Yuan strengthen at a faster pace, making China's exports more expensive. The central bank said, said in a statement posted on its website, that this interest rates adjustment will be conducive to the rational growth of credit and investment; conducive to maintaining a stable price level; conducive to the steady operation of the financial system; conducive to the balanced economic growth and structural optimization, and conducive to promoting sound and fast growth of the national economy. "The data released in the past week suggests that the economy is not actually slowing and that the government is becoming quite

大数据时代英语演讲

Hello, everyone. As we all know, we are now living at the age ofbig data, which leads a revolution that transforms how we think. But many people have half know of big data, they haven’t adopted to the tremendous change. So today I’d like to talk about the three peculiarities of big data. I will be very glad if my speech could help you. Firstly, all samples rather than sampling analysis. With the development of technology, we are able to process massive data, from which we can get more reliable result than through sampling analysis. So we can give up sampling analysis in most cases. Secondly, efficiency rather than accuracy. At the age of big data, we concern more on efficiency rather than accuracy. In other words, we should allow faults to improve efficiency. Comparing with massive data, some faults do not influence the final result. Thirdly, correlation is as important as causality. We can’t deny the importance of causality ,but sometimes it is difficult or unnecessary to explore it. For example, the recommendation system of Amazon doesn’t know why the customer who likes Hemingway’s works is likely to buy Fitzgerald’s works, it just recommends and helps Amazon sell more than 100 times books than before Amazon using it. From this example, we can conclude that correlation plays an important role at the age of big data,so please don’t overlook it.

金融专业英语及翻译

Opposite指“位置、方向、地位、性质、意义等对立的、相反的”, 如: 如: “True” and “ false ” have opposite meanings. “真”与“假”有着相反的意思。 Contrary指“两物朝相反的方向发展”, 含有“互相冲突, 不一致”的意思, 如: Your plan is contrary to mine. 你的计划与我的相反。 Inverse 颠倒的;倒数的 Evil is the inverse of good. Reverse 反过来,翻转 He reversed the car. 他倒车. 教育类 素质教育 education for all-round development 应试教育 the examination-oriented education 义务教育 compulsory education 片面追求升学率 place undue emphasis on the proportion of students' entering school of a higher level 高分低能 good scores but low qualities 扩招 expand enrollment 教书育人 impart knowledge and educate people 因材施教 teach students according to their aptitude 提高身心素质 improve the health and psychological quality 大学生创业 the university students' innovative undertaking 社会实践 social practice 文凭 diplomas and certificates 复合型人才 interdisciplinary talents 文化底蕴 the rich cultural deposits 适应社会的改变 adjust to the social changes 满足社会的急需 meet the urgent needs the society 工作类 人才流动和双向选择 talent flow and a dual-way selection 试用期 probationary period 跳槽 job-hopping 自由职业 freelance work 拜金主义 money worship 获得名利 achieve fame and wealth 充分发挥个人的潜力 develop fully one's potential and creativity 工作出色 excel in one's work 社会和个人的尊重 social and personal esteem 生计问题 a bread and butter issue 人才交流 talents exchange 培养人才 cultivate talents 人才外流 brain drain 失业问题 unemployment problems 下岗职工 the laid-off workers 自谋生路 be self-employed 劳动力短缺 shortage of manpower 医药卫生类 卫生环境 sanitary environment 营养不良 malnutrition

大英文翻译

001 不忘初心,牢记使命。 Remain true to our original aspiration and keep our mission firmly in mind. 002 这是我国发展新的历史方位。 This is a new historic juncture in China’s development. 003 新时代我国社会主要矛盾是人民日益增长的美好生活需要和不平衡不充分的发展之间的矛盾。 The principal contraction facing Chinese society in the new era is that between unbalanced and inadequate development and the people’s ever-growing needs for a better life. 004 党在新时代的强军目标是建设一支听党指挥、能打胜仗、作风优良的人民军队,把人民军队建设成为世界一流军队。 The Party’s goal of building a strong military in the new era is to build the people’s forces into world-class forces that obey the Party’s command, can fight and win, and maintain excellent conduct. 005 推动构建人类命运共同体 build a community with a shared future for mankind 006 近代以来久经磨难的中华民族迎来了从站起来、富起来到强起来的伟大飞跃。 The Chinese nation, which since modern times began had endured so much for so long, has achieved a tremendous transformation ——it has stood up, grown rich, and become strong. 007 行百里者半九十。 As the Chinese saying goes, the last leg of a journey just marks the halfway point. 008 敢于直面问题,敢于刮骨疗毒。 We must have the courage to face problems squarely, be braced for the pain. 009 不断增强党的政治领导力、思想领导力、群众组织力、社会号召力。 We must keep on strengthening the Party’s ability to lead politically, to guide through theory, to organize the people, and to inspire society. 010 保持政治定力,坚持实干兴邦。 We must maintain our political orientation, do the good solid work that sees our country thrive. 011 坚持党对一切工作的领导。 Ensuring Party leadership over all work. 012 坚持以人民为中心。 Committing to a people-centered approach. 013 坚持全面深化改革。 Continuing to comprehensively deepen reform.

Common Vocabulary Translations for the Financial English Exam

1. 素质教育:Quality Education
2. EQ:two kinds, 教育商数 (educational quotient) and 情感商数 (emotional quotient)
3. 保险业:the insurance industry
4. 保证重点支出:ensure funding for priority areas
5. 补发拖欠的养老金:clear up pension payments in arrears
6. 不良贷款:non-performing loan
7. 层层转包和违法分包:multi-level contracting and illegal subcontracting
8. 城乡信用社:credit cooperative in both urban and rural areas
9. 城镇居民最低生活保障:a minimum standard of living for city residents
10. 城镇职工医疗保障制度:the system of medical insurance for urban workers
11. 出口信贷:export credit
12. 贷款质量:loan quality
13. 贷款质量五级分类办法:the five-category assets classification for bank loans
14. 防范和化解金融风险:take precautions against and reduce financial risks
15. 防洪工程:flood-prevention project
16. 非法外汇交易:illegal foreign exchange transaction
17. 非贸易收汇:foreign exchange earnings through nontrade channels
18. 非银行金融机构:non-bank financial institutions
19. 费改税:transform administrative fees into taxes
20. 跟踪审计:follow-up auditing
21. 工程监理制度:the monitoring system for projects

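Translation Theory and Practice in the Big Data Era

Specialization: Translation Theory and Practice
Class: 2014 postgraduate cohort, English Language and Literature
Author: Zhang Qi (张琦)
Student ID: 81420379

Abstract: Drawing on a wide range of articles, this literature review summarizes the current state of translation theory and practice and the influence of the big data era on both. The current state covers the relationship between translation theory and practice; on the theory side, translation standards, translation principles, and studies of translators' methods; and on the practice side, research on areas such as translation teaching. With the arrival of the big data era, its gradual influence falls on both translation theory and translation practice. Organizing the review in this way makes it possible to examine effectively how the big data era affects translation theory and practice.

Keywords: translation theory; practice; current state; big data era; influence

In recent years, research on translation has ranged from theory to practice. With the arrival of the big data era, however, translation theory and practice have also been affected to some extent. They have moved from a traditional state of limited, single-source information to one of shared data and information, which has in turn influenced both.

To examine the influence of big data on translation theory and practice, I searched sites such as CNKI (中国知网) and Wanfang Data (万方数据) and found a considerable body of literature on the subject.

This review begins with the current state of translation theory and practice, examining theory and practice separately. The theory side is further subdivided into a discussion of the current state of translation rules and standards and an introduction to the methods and styles of contemporary translators. On the practice side, I describe the current state of areas such as translation teaching. I then turn to the influence of the big data era on translation theory and practice, again analysed from the theory side and the practice side.

I have collated and summarized the literature consulted, which I believe will be helpful to research on translation theory and practice in the big data era.

I. The current state of translation theory and practice

Before the influence of the big data era, translation theory and practice existed in a closed-off state, and our discussion of them was relatively narrow. This section introduces their current state: first the relationship between translation theory and practice; then, on the theory side, translation standards, rules and related topics; and on the practice side, teaching and related topics.

The relationship between theory and practice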

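Specialized English Vocabulary List for the Finance Postgraduate Entrance Exam (Economics)

1. air pocket (气囊): a conspicuous, extreme weakness in a stock.
2. backdoor listing (后门上市): a company that cannot meet an exchange's listing requirements on its own buys a listed company and merges itself into it in order to become listed.
3. basket purchase (一篮子购买): buying a group of assets for a single price. When the purchase is booked, however, each item can be entered separately, with a cost assigned to each asset.
4. bear trap (空头陷阱): a falling stock triggers heavy selling, and then the price rises again.
5. bed and breakfast deals (床头和早餐交易): a short-selling scheme in which an individual or company, under a prearranged deal, sells shares and buys them back the next day, creating a tax loss that offsets capital gains. The practice exists only in the UK.
6. bottom fisher (底部钓鱼人): an investor who looks for commodities or stocks whose prices have hit bottom and are about to turn around. In some cases the term refers to people who buy the stocks or bonds of bankrupt or near-bankrupt organizations.
7. butterfly spread (蝴蝶差): simultaneously buying or selling three futures contracts in the same or different markets, generating profit and borrowing rights.
8. Chinese Wall (中国墙): an insurmountable barrier used to stop the trading departments of Wall Street firms from making unfair use of information that investment bankers obtain confidentially from clients.
9. fallen angel (下坠天使): a high-priced security of a large company whose price drops suddenly because of negative news.
10. golden handcuffs (金手铐): a contract binding a broker to a brokerage firm; the brokerage industry's response to brokers moving frequently from one firm to another. It generally includes an agreement to return to the original firm most of the compensation received during employment.
11. gold brick (假金砖): a worthless security of a fraudulent nature.
12. gray knight (灰骑士): in a corporate takeover, an opportunistic second bidder not sought by the target, who only wants to profit from the problems between the target and the original bidder.
13. graveyard market (墓地市场): a securities market in which those inside cannot get out and those outside cannot get in.
14. lame duck (跛脚鸭): a speculator whose venture has failed, or someone in stock trading who is insolvent.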

Translations of Common Financial English Terms

acquiring company 收购公司
bad loan 呆帐
chart of cash flow 现金流量表
clearly-established ownership 产权清晰
debt to equity 债转股
diversity of equities 股权多元化
economy of scale 规模经济
emerging economies 新兴经济
exchange-rate regime 汇率机制
fund and financing 筹资融资
global financial architecture 全球金融体系
global integration, globality 全球一体化,全球化
go public 上市
growth spurt (经济的)急剧增长
have one's "two commas" 百万富翁
hedge against 套期保值
housing mortgage 住房按揭
holdings 控股,所持股份
holding company 控股公司
initial offerings 原始股
initial public offerings 首次公募
innovative business 创新企业
intellectual capital 智力资本
inter-bank lending 拆借
internet customer 网上客户
investment payoff period 投资回收期
joint-stock 参股
mall rat 爱逛商店的年轻人
means of production 生产要素
(the) medical cost social pool for major diseases 大病医疗费用社会统筹
mergers and acquisitions 并购
mobile-phone banking 移动电话银行业
moods 人气
net potato 网虫
non-store selling 直销
offering 新股
online-banking 网上银行业
online-finance 在线金融
online client (银行的)网上客户
paper profit 帐面收益
physical assets 有形资产
project fund system 项目资本金制度
pyramid sale 传销

The Era of Big Data (English Translation)

The era of big data is a woman's age: women are born to accumulate and process big data. Many men and children have in fact long wondered about this special ability. For example, when you were a child, no sooner had you entered the house than your mother said in a suspicious tone: "Liu Zhijun, you didn't do well in the exam today, did you?" Or you merely glance at your mobile phone and your wife laughs: "Is Er Gou next door asking you to play games?" Or you close the door to make a phone call and your girlfriend cries: "Who are shot in bed?"

They are sometimes right and sometimes wrong. On the whole, however, their accuracy is higher than chance. When they are wrong, men sneer that women always give way to foolish fancies; when they are right, men say that women are sensitive animals, perhaps with more acute sensory organs. Either way, that is a guess. What already scares men is that the overall accuracy is higher than chance. To adapt, men have also developed very strong counter-reconnaissance skills. That is beyond the scope of this article, so I will not go into detail.

Some studies, such as Hanna Holmes's paper, have indicated that women's brains have more white matter than men's, so women have a very strong capacity for connecting things together. Some recent studies have shown that women are better than men at remembering dates. That is why they are able to remember all the birthdays and anniversaries, and even some of the important days of unimportant friends.

No matter whether these results are true or not, I am afraid this is not women's most outstanding ability. Their most remarkable ability is the long-term tracking of seemingly unimportant data to form their own baseline and pattern. Once the pattern of these data points differs significantly from the baseline she is familiar with, she knows something unusual is going on. In daily life, women do not consider the difference between causality and correlation. They believe in one principle: "Where something is unusual, something must be wrong."

People who talk about big data often take Lin Biao as an example. After a battle, Lin Biao recorded detailed and seemingly unimportant data, such as the guns seized, the proportion of rifles to pistols, the ages of the prisoners of war, and the grain seized and whether it was sorghum or millet, all of which went into his book without exception. Others laughed at him. But later he used these data to determine where the enemy headquarters was.

What women do is almost the same. Girl A has a secret crush on boy B, but she usually does not contact him directly. Two days later, I asked her if she wanted to invite him to dinner. She said he was playing. I wondered, "How do you know that?" She said that boy B usually comes online on Gmail at 8:00 am; switches to away at 8:30 am, when he goes out to buy coffee and breakfast; is online again at 9:00 am with a busy status, because he is at work; is away again at 12:30 pm for lunch; and is online all evening, perhaps reading or playing games. His buddy C comes online at 10:00 am and is still online at 2:00 am the next day; he is a boy who gets up late and stays up late. His buddy D is online for most of the day. The most important pattern, however, is that on 2-3 days each week all of them are offline or away for the same 3-4 hours. Conclusion: they are playing together.

Finance Graduation Thesis: Translated Foreign Literature (Chinese and English, Complete)

Improve the concept of financial supervision in rural areas¹
Xun Qian

China has a vast farming population that includes both large-scale producers and subsistence-oriented farmers. The wide differences in their financial needs make rural financial intermediation complex. Agriculture itself is a weak industry, with low profits and high natural and market risks, so the cost of rural financial transactions is far higher than in the cities. These features also mean that the rural financial system has its own special characteristics, both in how it is organized and operated and in the market.

In 20 years of financial reform, urban finance in China has made impressive achievements, but rural finance remains the weakest link in the entire financial system. The supply of rural finance is insufficient, competition is inadequate, and the difficulty farmers and agricultural enterprises have in getting loans is very prominent. The backward rural financial system can no longer effectively support the development of modern agriculture, the transformation of traditional agriculture, or the building of a new socialist countryside, which raises a new topic: improving rural financial supervision.

China's rural financial regulatory problems

(A) China's financial regulatory system has formed a "one bank, three commissions" structure (the People's Bank, the Securities Regulatory Commission, the Insurance Regulatory Commission and the Banking Regulatory Commission). These stringent requirements, separate management and diversified supervision have played a positive role, but they have also had some negative effects.

First, supervision is inefficient, the internal costs of supervision are high, and the room for business development and innovation in the financial industry is limited.

Second, information is asymmetric between the regulatory agencies and the central bank, and the coordination mechanisms among the banking, securities and insurance regulators are imperfect. Information is difficult to share between the central bank and the regulatory agencies, which makes it difficult to create an effective joint supervisory force. Basically between the various

¹ American Journal of Agricultural Economics, 2009.
