Google Cloud Discovery Engine V1 Client - Class GcsTrainingInput (1.7.0)

Reference documentation and code samples for the Google Cloud Discovery Engine V1 Client class GcsTrainingInput.

Cloud Storage training data input.

Generated from protobuf message google.cloud.discoveryengine.v1.TrainCustomModelRequest.GcsTrainingInput

Namespace

Google \ Cloud \ DiscoveryEngine \ V1 \ TrainCustomModelRequest

Methods

__construct

Constructor.

Parameters
Name Description
data array

Optional. Data for populating the Message object.

↳ corpus_data_path string

The Cloud Storage corpus data that can be associated with the training data. The data path format is gs://<bucket_to_data>/<jsonl_file_name>. The file is a newline-delimited JSONL/NDJSON file. For a search-tuning model, each line should have the _id, title, and text fields. Example: {"_id": "doc1", "title": "relevant doc", "text": "relevant text"}

↳ query_data_path string

The Cloud Storage query data that can be associated with the training data. The data path format is gs://<bucket_to_data>/<jsonl_file_name>. The file is a newline-delimited JSONL/NDJSON file. For a search-tuning model, each line should have the _id and text fields. Example: {"_id": "query1", "text": "example query"}

↳ train_data_path string

Cloud Storage training data path, in the format gs://<bucket_to_data>/<tsv_file_name>. The file should be in TSV format. Each line should have the doc_id, the query_id, and the score (a number). For a search-tuning model, the TSV file header should be query-id, corpus-id, and score, separated by tabs. The score should be a number in [0, +inf). The larger the score, the more relevant the pair. Example:

  • query-id\tcorpus-id\tscore
  • query1\tdoc1\t1

↳ test_data_path string

Cloud Storage test data. Same format as train_data_path. If not provided, a random 80/20 train/test split will be performed on train_data_path.
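
The four keys above can be assembled into the constructor's data array roughly as follows. This is a minimal sketch, not the library's prescribed usage: the helper name buildGcsTrainingInputData, the bucket name, and the file names are all hypothetical placeholders, and the class is only instantiated when the google/cloud-discoveryengine package happens to be autoloaded.

```php
<?php
// Hypothetical helper: assembles the data array accepted by the constructor.
// Every gs:// path built here is a placeholder, not a real bucket or file.
function buildGcsTrainingInputData(string $bucket, bool $withTestData = false): array
{
    $data = [
        'corpus_data_path' => "gs://{$bucket}/corpus.jsonl",
        'query_data_path'  => "gs://{$bucket}/queries.jsonl",
        'train_data_path'  => "gs://{$bucket}/train.tsv",
    ];
    if ($withTestData) {
        // When this key is omitted, the service performs the random 80/20 split.
        $data['test_data_path'] = "gs://{$bucket}/test.tsv";
    }
    return $data;
}

$data = buildGcsTrainingInputData('my-example-bucket');

// Instantiate the message only if the client library is available.
$class = 'Google\\Cloud\\DiscoveryEngine\\V1\\TrainCustomModelRequest\\GcsTrainingInput';
if (class_exists($class)) {
    $input = new $class($data);
    echo $input->getTrainDataPath(), PHP_EOL;
}
```

Building the array separately keeps the path layout in one place, so the same values can be logged or checked before the message is created.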

getCorpusDataPath

The Cloud Storage corpus data that can be associated with the training data.

The data path format is gs://<bucket_to_data>/<jsonl_file_name>. The file is a newline-delimited JSONL/NDJSON file. For a search-tuning model, each line should have the _id, title, and text fields. Example: {"_id": "doc1", "title": "relevant doc", "text": "relevant text"}

Returns
Type Description
string

setCorpusDataPath

The Cloud Storage corpus data that can be associated with the training data.

The data path format is gs://<bucket_to_data>/<jsonl_file_name>. The file is a newline-delimited JSONL/NDJSON file. For a search-tuning model, each line should have the _id, title, and text fields. Example: {"_id": "doc1", "title": "relevant doc", "text": "relevant text"}

Parameter
Name Description
var string
Returns
Type Description
$this
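
The corpus file format described above (one JSON object per line with _id, title, and text) could be produced with plain PHP before uploading the file to Cloud Storage. The helper name encodeCorpusJsonl is hypothetical and the upload step is left out; this is a sketch under those assumptions.

```php
<?php
// Hypothetical helper: encodes corpus documents as newline-delimited JSON.
function encodeCorpusJsonl(array $docs): string
{
    $lines = [];
    foreach ($docs as $doc) {
        foreach (['_id', 'title', 'text'] as $field) {
            if (!isset($doc[$field]) || !is_string($doc[$field])) {
                throw new InvalidArgumentException("Missing string field: {$field}");
            }
        }
        // One JSON object per line, keeping only the three documented fields.
        // json_encode escapes embedded newlines, so each record stays on one line.
        $lines[] = json_encode(
            ['_id' => $doc['_id'], 'title' => $doc['title'], 'text' => $doc['text']],
            JSON_UNESCAPED_UNICODE | JSON_UNESCAPED_SLASHES | JSON_THROW_ON_ERROR
        );
    }
    return $lines === [] ? '' : implode("\n", $lines) . "\n";
}

$jsonl = encodeCorpusJsonl([
    ['_id' => 'doc1', 'title' => 'relevant doc', 'text' => 'relevant text'],
]);
echo $jsonl;
```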

getQueryDataPath

The Cloud Storage query data that can be associated with the training data.

The data path format is gs://<bucket_to_data>/<jsonl_file_name>. The file is a newline-delimited JSONL/NDJSON file. For a search-tuning model, each line should have the _id and text fields. Example: {"_id": "query1", "text": "example query"}

Returns
Type Description
string

setQueryDataPath

The Cloud Storage query data that can be associated with the training data.

The data path format is gs://<bucket_to_data>/<jsonl_file_name>. The file is a newline-delimited JSONL/NDJSON file. For a search-tuning model, each line should have the _id and text fields. Example: {"_id": "query1", "text": "example query"}

Parameter
Name Description
var string
Returns
Type Description
$this
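
Before uploading a query file, it may help to check locally that every line carries the _id and text fields described above. The helper findInvalidQueryLines is hypothetical and only reflects the fields named on this page; the service may apply further validation that is not documented here.

```php
<?php
// Hypothetical helper: returns the 1-based numbers of lines in a query JSONL
// string that are not JSON objects with string "_id" and "text" fields.
function findInvalidQueryLines(string $jsonl): array
{
    $bad = [];
    foreach (explode("\n", rtrim($jsonl, "\n")) as $i => $line) {
        $row = json_decode($line, true);
        if (!is_array($row) || !isset($row['_id'], $row['text'])
            || !is_string($row['_id']) || !is_string($row['text'])) {
            $bad[] = $i + 1;
        }
    }
    return $bad;
}

$sample = '{"_id": "query1", "text": "example query"}' . "\n"
        . '{"_id": "query2"}' . "\n";
$invalid = findInvalidQueryLines($sample);
print_r($invalid);
```

In this sample the second line lacks a text field, so it is the only line reported.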

getTrainDataPath

Cloud Storage training data path, in the format gs://<bucket_to_data>/<tsv_file_name>. The file should be in TSV format. Each line should have the doc_id, the query_id, and the score (a number).

For a search-tuning model, the TSV file header should be query-id, corpus-id, and score, separated by tabs. The score should be a number in [0, +inf). The larger the score, the more relevant the pair. Example:

  • query-id\tcorpus-id\tscore
  • query1\tdoc1\t1
Returns
Type Description
string

setTrainDataPath

Cloud Storage training data path, in the format gs://<bucket_to_data>/<tsv_file_name>. The file should be in TSV format. Each line should have the doc_id, the query_id, and the score (a number).

For a search-tuning model, the TSV file header should be query-id, corpus-id, and score, separated by tabs. The score should be a number in [0, +inf). The larger the score, the more relevant the pair. Example:

  • query-id\tcorpus-id\tscore
  • query1\tdoc1\t1
Parameter
Name Description
var string
Returns
Type Description
$this
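
The training file layout described above (a query-id, corpus-id, score header followed by tab-separated rows with a non-negative score) could be generated along these lines. The helper buildTrainTsv is hypothetical, and the column order is taken from the example on this page.

```php
<?php
// Hypothetical helper: builds the training TSV from [queryId, corpusId, score] rows.
function buildTrainTsv(array $rows): string
{
    // Header columns as shown in the documented example.
    $lines = ["query-id\tcorpus-id\tscore"];
    foreach ($rows as [$queryId, $corpusId, $score]) {
        // The score must be a finite number in [0, +inf).
        if ((!is_int($score) && !is_float($score))
            || !is_finite((float) $score) || $score < 0) {
            throw new InvalidArgumentException('Score must be a finite number >= 0');
        }
        foreach ([$queryId, $corpusId] as $id) {
            // Tabs or newlines inside an ID would corrupt the TSV layout.
            if (strpbrk((string) $id, "\t\n") !== false) {
                throw new InvalidArgumentException('IDs must not contain tabs or newlines');
            }
        }
        $lines[] = "{$queryId}\t{$corpusId}\t{$score}";
    }
    return implode("\n", $lines) . "\n";
}

$tsv = buildTrainTsv([
    ['query1', 'doc1', 1],
    ['query1', 'doc2', 0],
]);
echo $tsv;
```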

getTestDataPath

Cloud Storage test data. Same format as train_data_path. If not provided, a random 80/20 train/test split will be performed on train_data_path.

Returns
Type Description
string

setTestDataPath

Cloud Storage test data. Same format as train_data_path. If not provided, a random 80/20 train/test split will be performed on train_data_path.

Parameter
Name Description
var string
Returns
Type Description
$this
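
If a repeatable split is preferred over the service's random 80/20 split, the labeled rows can be divided locally and the second part uploaded as the test file. This is only an illustrative sketch: the helper splitTrainTest and the seed value are hypothetical, and it does not reproduce how the service performs its own split.

```php
<?php
// Hypothetical helper: shuffles TSV data rows (header excluded) and splits
// them into a training part and a test part.
function splitTrainTest(array $rows, float $trainFraction = 0.8, int $seed = 42): array
{
    // Seeding makes the shuffle repeatable for a given PHP version.
    mt_srand($seed);
    shuffle($rows);
    $cut = (int) round(count($rows) * $trainFraction);
    return [array_slice($rows, 0, $cut), array_slice($rows, $cut)];
}

// Ten placeholder rows in the documented query-id, corpus-id, score layout.
$rows = [];
for ($i = 1; $i <= 10; $i++) {
    $rows[] = "query{$i}\tdoc{$i}\t1";
}

[$train, $test] = splitTrainTest($rows);
echo count($train), ' training rows, ', count($test), ' test rows', PHP_EOL;
```

Each part would still need the header line prepended before upload, since the test file uses the same format as train_data_path.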


Last updated 2025-10-30 UTC.