Apache/beam

addprefix.py 3.29 KB
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
"""A Python multi-language pipeline that adds prefixes to a given set of strings.

This pipeline reads an input text file and adds two prefixes to every line read from the file.

* Prefix 'java'. This is added by a multi-language Java transform named 'JavaPrefix'.
* Prefix 'python'. This is added by a Python transform.

Example commands for executing the program:

DirectRunner:
$ python addprefix.py --runner DirectRunner --environment_type=DOCKER --input <INPUT FILE> --output output --expansion_service_port <PORT>

DataflowRunner:
$ python addprefix.py \
      --runner DataflowRunner \
      --temp_location $TEMP_LOCATION \
      --project $GCP_PROJECT \
      --region $GCP_REGION \
      --job_name $JOB_NAME \
      --num_workers $NUM_WORKERS \
      --input "gs://dataflow-samples/shakespeare/kinglear.txt" \
      --output "gs://$GCS_BUCKET/javaprefix/output" \
      --expansion_service_port <PORT>
"""

import logging
import re
import typing

import apache_beam as beam
from apache_beam.io import ReadFromText
from apache_beam.io import WriteToText
from apache_beam.transforms.external import ImplicitSchemaPayloadBuilder
from apache_beam.options.pipeline_options import PipelineOptions


def run(input_path, output_path, expansion_service_port, pipeline_args):
  pipeline_options = PipelineOptions(pipeline_args)

  with beam.Pipeline(options=pipeline_options) as p:
    # Read the input file as a PCollection of str, one element per line.
    # Named `lines` rather than `input` to avoid shadowing the builtin.
    lines = p | 'Read' >> ReadFromText(input_path).with_output_types(str)

    # Apply the cross-language Java transform, identified by its URN and
    # expanded by the expansion service listening on the given local port.
    java_output = (
        lines
        | 'JavaPrefix' >> beam.ExternalTransform(
            'beam:transform:org.apache.beam:javaprefix:v1',
            ImplicitSchemaPayloadBuilder({'prefix': 'java:'}),
            ('localhost:%s' % expansion_service_port)))

    def python_prefix(record):
      return 'python:%s' % record

    # Add the Python prefix to the Java transform's output, then write it.
    output = java_output | 'PythonPrefix' >> beam.Map(python_prefix)
    output | 'Write' >> WriteToText(output_path)


if __name__ == '__main__':
  logging.getLogger().setLevel(logging.INFO)
  import argparse

  parser = argparse.ArgumentParser()
  parser.add_argument(
      '--input',
      dest='input',
      required=True,
      help='Input file')
  parser.add_argument(
      '--output',
      dest='output',
      required=True,
      help='Output file')
  parser.add_argument(
      '--expansion_service_port',
      dest='expansion_service_port',
      required=True,
      help='Expansion service port')

  known_args, pipeline_args = parser.parse_known_args()

  run(
      known_args.input,
      known_args.output,
      known_args.expansion_service_port,
      pipeline_args)
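
The script relies on two behaviours that can be checked without Beam or a running expansion service: `parse_known_args` hands back the unrecognised flags (such as `--runner`) so they can be passed to `PipelineOptions`, and each record receives the Java prefix before the Python one. The standard-library sketch below illustrates both. The helper names `split_args` and `expected_record` are hypothetical and not part of the original file, and `expected_record` assumes that the `JavaPrefix` transform simply prepends its configured prefix to each line, which the source describes but does not show.

```python
import argparse


def split_args(argv):
  """Separate the script's own flags from flags meant for the Beam runner."""
  parser = argparse.ArgumentParser()
  parser.add_argument('--input', required=True)
  parser.add_argument('--output', required=True)
  parser.add_argument('--expansion_service_port', required=True)
  # parse_known_args returns (namespace, list of unrecognised arguments).
  return parser.parse_known_args(argv)


def expected_record(line, java_prefix='java:'):
  """Emulate both prefix steps locally (assumes JavaPrefix prepends)."""
  java_out = java_prefix + line  # stand-in for the external Java transform
  return 'python:%s' % java_out  # same formatting as python_prefix above


known, pipeline_args = split_args([
    '--runner', 'DirectRunner',
    '--input', 'in.txt',
    '--output', 'out',
    '--expansion_service_port', '12345',
])
print(known.expansion_service_port)
print(pipeline_args)
print(expected_record('KING LEAR'))
```

Under that assumption, an input line ends up with the Python prefix outermost, because the Python `Map` step runs on the output of the Java transform.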
About

Apache Beam is a unified model for defining both batch and streaming data-parallel processing pipelines, as well as a set of language-specific SDKs for constructing pipelines and Runners for executing them on distributed processing backends, including Apache Flink, Apache Spark, Google Cloud Dataflow and Hazelcast Jet.