Google Cloud Discovery Engine V1 Client - Class GcsTrainingInput (1.7.0)
Reference documentation and code samples for the Google Cloud Discovery Engine V1 Client class GcsTrainingInput.
Cloud Storage training data input.
Generated from protobuf message google.cloud.discoveryengine.v1.TrainCustomModelRequest.GcsTrainingInput
Namespace
Google \ Cloud \ DiscoveryEngine \ V1 \ TrainCustomModelRequest
Methods
__construct
Constructor.
| Parameters | |
|---|---|
| Name | Description |
| data | array. Optional. Data for populating the Message object. |
| ↳ corpus_data_path | string. The Cloud Storage corpus data that can be associated with the training data. The data path format is gs://<bucket_to_data>/<jsonl_file_name>. |
| ↳ query_data_path | string. The Cloud Storage query data that can be associated with the training data. The data path format is gs://<bucket_to_data>/<jsonl_file_name>. |
| ↳ train_data_path | string. Cloud Storage training data path whose format should be gs://<bucket_to_data>/<tsv_file_name>. |
| ↳ test_data_path | string. Cloud Storage test data. Same format as train_data_path. If not provided, a random 80/20 train/test split will be performed on train_data_path. |
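The constructor's data array can be assembled as in the following minimal sketch. The helper name buildTrainingInputData, the bucket name, and the file names are hypothetical placeholders, not part of the library. The message is only constructed when the client library (the google/cloud-discoveryengine Composer package) is autoloadable.

```php
<?php
use Google\Cloud\DiscoveryEngine\V1\TrainCustomModelRequest\GcsTrainingInput;

/**
 * Hypothetical helper (not part of the library): assembles the $data array
 * accepted by the GcsTrainingInput constructor. test_data_path is left out
 * when no test file is given.
 */
function buildTrainingInputData(
    string $corpus,
    string $query,
    string $train,
    ?string $test = null
): array {
    $data = [
        'corpus_data_path' => $corpus,
        'query_data_path'  => $query,
        'train_data_path'  => $train,
    ];
    if ($test !== null) {
        $data['test_data_path'] = $test;
    }
    return $data;
}

// Placeholder paths; replace with your own bucket and file names.
$data = buildTrainingInputData(
    'gs://example-bucket/corpus.jsonl',
    'gs://example-bucket/queries.jsonl',
    'gs://example-bucket/train.tsv'
);

// Construct the message only if the client library has been autoloaded.
if (class_exists(GcsTrainingInput::class)) {
    $input = new GcsTrainingInput($data);
}
```

Leaving test_data_path out relies on the documented behavior that a random 80/20 train/test split is then performed on train_data_path.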
getCorpusDataPath
The Cloud Storage corpus data that can be associated with the training data.
The data path format is gs://<bucket_to_data>/<jsonl_file_name>.
The file is a newline-delimited JSON (JSONL/NDJSON) file.
For a search-tuning model, each line should have the _id, title,
and text fields. Example:
{"_id": "doc1", "title": "relevant doc", "text": "relevant text"}
| Returns | |
|---|---|
| Type | Description |
| string | |
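A corpus file in this format could be produced with plain PHP, as in the sketch below. The function name encodeCorpusJsonl is hypothetical and not part of the client library; uploading the resulting file to Cloud Storage is out of scope here.

```php
<?php
/**
 * Hypothetical helper (not part of the library): encodes documents as
 * newline-delimited JSON, one object per line, with _id, title, and text.
 */
function encodeCorpusJsonl(array $docs): string
{
    $lines = [];
    foreach ($docs as $doc) {
        foreach (['_id', 'title', 'text'] as $field) {
            if (!isset($doc[$field])) {
                throw new InvalidArgumentException("Missing field: $field");
            }
        }
        // Emit only the three documented fields, in a fixed order.
        $lines[] = json_encode(
            ['_id' => $doc['_id'], 'title' => $doc['title'], 'text' => $doc['text']],
            JSON_THROW_ON_ERROR | JSON_UNESCAPED_UNICODE
        );
    }
    return $lines === [] ? '' : implode("\n", $lines) . "\n";
}

$jsonl = encodeCorpusJsonl([
    ['_id' => 'doc1', 'title' => 'relevant doc', 'text' => 'relevant text'],
]);
echo $jsonl;
```

json_encode writes each object on a single line, which is what a newline-delimited file needs.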
setCorpusDataPath
The Cloud Storage corpus data that can be associated with the training data.
The data path format is gs://<bucket_to_data>/<jsonl_file_name>.
The file is a newline-delimited JSON (JSONL/NDJSON) file.
For a search-tuning model, each line should have the _id, title,
and text fields. Example:
{"_id": "doc1", "title": "relevant doc", "text": "relevant text"}
| Parameter | |
|---|---|
| Name | Description |
| var | string |

| Returns | |
|---|---|
| Type | Description |
| $this | |
getQueryDataPath
The Cloud Storage query data that can be associated with the training data.
The data path format is gs://<bucket_to_data>/<jsonl_file_name>.
The file is a newline-delimited JSON (JSONL/NDJSON) file.
For a search-tuning model, each line should have the _id
and text fields. Example: {"_id": "query1", "text": "example query"}
| Returns | |
|---|---|
| Type | Description |
| string | |
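Before uploading a query file, it can help to check locally that every line parses and carries both fields. The following is a rough sketch of such a check; validateQueryJsonl is a hypothetical name, and the check covers only the presence of _id and text, not any further rules the service may apply.

```php
<?php
/**
 * Hypothetical validator (not part of the library): returns a list of
 * problems found in newline-delimited JSON query data. An empty list
 * means every non-blank line decoded and had both _id and text.
 */
function validateQueryJsonl(string $jsonl): array
{
    $errors = [];
    foreach (explode("\n", $jsonl) as $i => $line) {
        if (trim($line) === '') {
            continue; // Skip blank lines, including the trailing one.
        }
        $row = json_decode($line, true);
        if (!is_array($row)) {
            $errors[] = sprintf('line %d: not valid JSON', $i + 1);
            continue;
        }
        foreach (['_id', 'text'] as $field) {
            if (!isset($row[$field])) {
                $errors[] = sprintf('line %d: missing field "%s"', $i + 1, $field);
            }
        }
    }
    return $errors;
}

$problems = validateQueryJsonl(
    '{"_id": "query1", "text": "example query"}' . "\n" .
    '{"_id": "query2"}' . "\n"
);
print_r($problems);
```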
setQueryDataPath
The Cloud Storage query data that can be associated with the training data.
The data path format is gs://<bucket_to_data>/<jsonl_file_name>.
The file is a newline-delimited JSON (JSONL/NDJSON) file.
For a search-tuning model, each line should have the _id
and text fields. Example: {"_id": "query1", "text": "example query"}
| Parameter | |
|---|---|
| Name | Description |
| var | string |

| Returns | |
|---|---|
| Type | Description |
| $this | |
getTrainDataPath
Cloud Storage training data path whose format should be
gs://<bucket_to_data>/<tsv_file_name>. The file should be in TSV
format. Each line should have the doc_id, query_id, and score (a number).
For a search-tuning model, the file should have query-id, corpus-id, and
score as its TSV header. The score should be a number in [0, +inf).
The larger the number, the more relevant the pair. Example:
query-id\tcorpus-id\tscore
query1\tdoc1\t1
| Returns | |
|---|---|
| Type | Description |
| string | |
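The TSV layout for a search-tuning model can be generated as in the sketch below. encodeTrainTsv is a hypothetical helper, not a library function; it writes the query-id, corpus-id, score header and rejects scores outside [0, +inf).

```php
<?php
/**
 * Hypothetical helper (not part of the library): encodes
 * [query id, corpus id, score] rows as TSV with the search-tuning header.
 */
function encodeTrainTsv(array $rows): string
{
    $lines = ["query-id\tcorpus-id\tscore"];
    foreach ($rows as [$queryId, $corpusId, $score]) {
        // Scores must be finite numbers greater than or equal to zero.
        if (!is_numeric($score) || $score < 0 || !is_finite((float) $score)) {
            throw new InvalidArgumentException('score must be a number in [0, +inf)');
        }
        $lines[] = implode("\t", [$queryId, $corpusId, $score]);
    }
    return implode("\n", $lines) . "\n";
}

$tsv = encodeTrainTsv([
    ['query1', 'doc1', 1],
]);
echo $tsv;
```

The ids in each row are expected to match the _id values used in the query and corpus files.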
setTrainDataPath
Cloud Storage training data path whose format should be
gs://<bucket_to_data>/<tsv_file_name>. The file should be in TSV
format. Each line should have the doc_id, query_id, and score (a number).
For a search-tuning model, the file should have query-id, corpus-id, and
score as its TSV header. The score should be a number in [0, +inf).
The larger the number, the more relevant the pair. Example:
query-id\tcorpus-id\tscore
query1\tdoc1\t1
| Parameter | |
|---|---|
| Name | Description |
| var | string |

| Returns | |
|---|---|
| Type | Description |
| $this | |
getTestDataPath
Cloud Storage test data. Same format as train_data_path. If not provided, a random 80/20 train/test split will be performed on train_data_path.
| Returns | |
|---|---|
| Type | Description |
| string | |
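When no test path is provided, the split is performed by the service, and how it is done is not described here. If you would rather control the split and supply your own test file, one possible local approach is sketched below. The function splitTrainTest, its seed parameter, and the sample rows are all hypothetical and are not the service's method.

```php
<?php
/**
 * Hypothetical helper (not part of the library): shuffles data rows and
 * splits them into [train rows, test rows] at the given fraction.
 */
function splitTrainTest(array $rows, float $trainFraction = 0.8, ?int $seed = null): array
{
    if ($seed !== null) {
        mt_srand($seed); // Make the shuffle repeatable for a given seed.
    }
    shuffle($rows);
    $cut = (int) floor(count($rows) * $trainFraction);
    return [array_slice($rows, 0, $cut), array_slice($rows, $cut)];
}

// Ten illustrative [query id, corpus id, score] rows, header excluded.
$rows = [];
for ($i = 1; $i <= 10; $i++) {
    $rows[] = ["query$i", "doc$i", 1];
}

[$train, $test] = splitTrainTest($rows, 0.8, 42);
printf("train: %d rows, test: %d rows\n", count($train), count($test));
```

Split only the data rows and write the TSV header into both resulting files, since the test file uses the same format as train_data_path.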
setTestDataPath
Cloud Storage test data. Same format as train_data_path. If not provided, a random 80/20 train/test split will be performed on train_data_path.
| Parameter | |
|---|---|
| Name | Description |
| var | string |

| Returns | |
|---|---|
| Type | Description |
| $this | |