Migrate cassie.py from deprecated pycassa to cassandra-driver ORM #4

Draft
Copilot wants to merge 21 commits into main
base: main
from copilot/remove-pycassa-from-cassie
Commits
38efdd6
Initial plan
Copilot Nov 28, 2025
15fe201
Replace pycassa with cassandra ORM in cassie.py and add missing models
Copilot Nov 28, 2025
0967b30
Address code review feedback: fix imports, simplify sorting, improve ...
Copilot Nov 28, 2025
c32a654
Fix get_as_dict methods to be classmethods with proper cls parameter
Copilot Nov 28, 2025
bf64fe5
Fix type handling in get_package_new_buckets for Ascii/Text fields
Copilot Nov 28, 2025
b7adbe6
Improve readability in record_bug_for_bucket by simplifying column1 e...
Copilot Nov 28, 2025
b122239
Fix dictionary-style access to use dot notation in get_as_dict methods
Copilot Nov 28, 2025
c281c5c
Add example scripts for all migrated cassie.py functions
Copilot Nov 28, 2025
49a4dd0
Add setup_cassandra() call to all example scripts for easier use
Copilot Nov 28, 2025
c5f80ff
cassie: don't call 'cassandra_session' at module import time
Hyask Dec 2, 2025
f29d516
daisy: remove the counter updates
Hyask Dec 17, 2025
eb622f1
errortracker: fix cassandra schema
Hyask Dec 2, 2025
f52dd13
cassie: formatting pass
Hyask Dec 2, 2025
9523fcf
cassie: remove the use of OrderedDict, dict are ordered by default now
Hyask Dec 2, 2025
6e59799
oopses: try to make use of the 'Date' field of a crash
Hyask Dec 19, 2025
0130db1
examples: default to using Noble, for more up-to-date data
Hyask Dec 2, 2025
abbebb1
cassandra_schema: document columns
Hyask Dec 19, 2025
a3c8b39
cassie: manual tests and fixes against production data
Hyask Dec 19, 2025
032d6b8
tests: introduce testing of cassie
Hyask Dec 19, 2025
ab360ab
Add comprehensive tests for get_package_crash_rate covering different...
Copilot Dec 17, 2025
35b4cd2
tests: speed up tests by having cassandra fixtures be 'class' scoped
Hyask Dec 19, 2025
91 changes: 91 additions & 0 deletions examples/cassie_functions/README.md
@@ -0,0 +1,91 @@
# Cassie Functions - Example Usage Scripts

This directory contains minimal example scripts showing how to call each function in `src/errors/cassie.py` that was migrated from `pycassa` to the `cassandra-driver` ORM.

## Purpose

These scripts provide:
- Clear examples of function signatures and parameters
- Sample input data for each function
- Basic usage patterns

## Important Notes

⚠️ **These are example scripts only** - They demonstrate the API but won't run successfully without:
- A properly configured Cassandra database connection (configured via `errortracker.config`)
- Valid data in the database
- Required dependencies installed (cassandra-driver, numpy, etc.)

Each script includes a call to `setup_cassandra()` which initializes the Cassandra connection before using any functions. This function:
- Sets up the database connection using credentials from the configuration
- Synchronizes the database schema
- Ensures the connection is ready for queries
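
The example scripts in this directory prepend the relative path `../../src` to `sys.path`, which is resolved against the current working directory, so they only find the modules when launched from `examples/cassie_functions/`. A minimal sketch of one alternative, resolving the path from the script's own location instead; the helper name `add_src_to_path` and its `levels_up` parameter are hypothetical and not part of this PR:

```python
import sys
from pathlib import Path


def add_src_to_path(script_file: str, levels_up: int = 2) -> Path:
    """Prepend <repo>/src to sys.path, resolved from the script's location.

    ASSUMPTION: the script lives two directories below the repository root
    (examples/cassie_functions/), matching the '../../src' used in the examples.
    """
    src = Path(script_file).resolve().parents[levels_up] / "src"
    if str(src) not in sys.path:
        sys.path.insert(0, str(src))
    return src


# In a real script this would be called as add_src_to_path(__file__).
src = add_src_to_path("/tmp/repo/examples/cassie_functions/bucket_exists.py")
print(src)
```

With this in place a script could be started from any directory, at the cost of a few extra lines per example.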

## Structure

Each file corresponds to one function in `cassie.py`:
- `get_total_buckets_by_day.py` - Example for `get_total_buckets_by_day()`
- `get_bucket_counts.py` - Example for `get_bucket_counts()`
- `get_crashes_for_bucket.py` - Example for `get_crashes_for_bucket()`
- And so on...

## Usage

To understand how to use a specific function:

1. Open the corresponding `.py` file
2. Review the function call with example parameters
3. Adapt the parameters to your use case

Example:
```bash
# View the example (won't execute without DB connection)
cat get_bucket_counts.py
```

## Functions Included

All functions migrated from `pycassa` to the `cassandra-driver` ORM:

### Bucket Operations
- `get_total_buckets_by_day` - Get bucket counts by day
- `get_bucket_counts` - Get bucket counts with filtering
- `get_crashes_for_bucket` - Get crashes for a specific bucket
- `get_package_for_bucket` - Get package info for bucket
- `get_metadata_for_bucket` - Get metadata for bucket
- `get_metadata_for_buckets` - Get metadata for multiple buckets
- `get_versions_for_bucket` - Get versions for bucket
- `get_source_package_for_bucket` - Get source package
- `get_retrace_failure_for_bucket` - Get retrace failure info
- `get_traceback_for_bucket` - Get traceback for bucket
- `get_stacktrace_for_bucket` - Get stacktrace for bucket
- `bucket_exists` - Check if bucket exists

### Crash Operations
- `get_crash` - Get crash details
- `get_crash_count` - Get crash counts over time
- `get_user_crashes` - Get crashes for a user
- `get_average_crashes` - Get average crashes per user
- `get_average_instances` - Get average instances for bucket

### Package Operations
- `get_package_crash_rate` - Analyze package crash rates
- `get_package_new_buckets` - Get new buckets for package version
- `get_binary_packages_for_user` - Get user's packages

### Retracer Operations
- `get_retracer_count` - Get retracer count for date
- `get_retracer_counts` - Get retracer counts over time
- `get_retracer_means` - Get mean retracing times

### Bug/Signature Operations
- `record_bug_for_bucket` - Record a bug for bucket
- `get_signatures_for_bug` - Get signatures for bug
- `get_problem_for_hash` - Get problem for hash

### System Image Operations
- `get_system_image_versions` - Get system image versions

## Migration Notes

These functions were migrated from the deprecated `pycassa` library to the modern `cassandra-driver` ORM while maintaining backward compatibility.
17 changes: 17 additions & 0 deletions examples/cassie_functions/bucket_exists.py
@@ -0,0 +1,17 @@
#!/usr/bin/env python3
"""Example usage of bucket_exists function."""

import sys
sys.path.insert(0, '../../src')

from errortracker.cassandra import setup_cassandra
from errors.cassie import bucket_exists

# Setup Cassandra connection
setup_cassandra()

# Example: Check if a bucket exists
bucketid = "/bin/zsh:11:makezleparams:execzlefunc:redrawhook:zlecore:zleread"

exists = bucket_exists(bucketid)
print(f"Bucket {bucketid} exists: {exists}")
21 changes: 21 additions & 0 deletions examples/cassie_functions/get_average_crashes.py
@@ -0,0 +1,21 @@
#!/usr/bin/env python3
"""Example usage of get_average_crashes function."""

import sys
sys.path.insert(0, '../../src')

from errortracker.cassandra import setup_cassandra
from errors.cassie import get_average_crashes

# Setup Cassandra connection
setup_cassandra()

# Example: Get average crashes per user
field = "zsh:5.9-6ubuntu2"
release = "Ubuntu 24.04"
days = 14

data = get_average_crashes(field, release, days=days)
print(f"Average crash data: {data}")
for timestamp, avg in data:
    print(f"Timestamp: {timestamp}, Average: {avg}")
19 changes: 19 additions & 0 deletions examples/cassie_functions/get_average_instances.py
@@ -0,0 +1,19 @@
#!/usr/bin/env python3
"""Example usage of get_average_instances function."""

import sys
sys.path.insert(0, '../../src')

from errortracker.cassandra import setup_cassandra
from errors.cassie import get_average_instances

# Setup Cassandra connection
setup_cassandra()

# Example: Get average instances for a bucket
bucketid = "/bin/zsh:11:makezleparams:execzlefunc:redrawhook:zlecore:zleread"
release = "Ubuntu 24.04"
days = 7

for timestamp, avg in get_average_instances(bucketid, release, days=days):
    print(f"Timestamp: {timestamp}, Average: {avg}")
23 changes: 23 additions & 0 deletions examples/cassie_functions/get_binary_packages_for_user.py
@@ -0,0 +1,23 @@
#!/usr/bin/env python3
"""Example usage of get_binary_packages_for_user function."""

import sys
sys.path.insert(0, '../../src')

from errortracker.cassandra import setup_cassandra
from errors.cassie import get_binary_packages_for_user

# Setup Cassandra connection
setup_cassandra()

# Example: Get binary packages for a user
# Only one assignment should be active; the alternative is left commented out.
# user = "foundations-bugs"  # quite slow (~1m56s)
user = "xubuntu-bugs"  # way faster (~12s)

packages = get_binary_packages_for_user(user)
if packages:
    print(f"Found {len(packages)} packages")
    for package in packages:
        print(f"Package: {package}")
else:
    print("No packages found")
51 changes: 51 additions & 0 deletions examples/cassie_functions/get_bucket_counts.py
@@ -0,0 +1,51 @@
#!/usr/bin/env python3
"""Example usage of get_bucket_counts function."""

import sys
sys.path.insert(0, '../../src')

from errortracker.cassandra import setup_cassandra
from errors.cassie import get_bucket_counts

# Setup Cassandra connection
setup_cassandra()

# Example: Get bucket counts for Ubuntu 24.04 today
print("Ubuntu 24.04 - today")
result = get_bucket_counts(
    release="Ubuntu 24.04",
    period="today"
)

print(f"Found {len(result)} buckets")
for bucket, count in result[:30]:
    print(f"Bucket: {bucket}, Count: {count}")

# Example: Get bucket counts for the past week (no release filter)
print("Past week")
result = get_bucket_counts(
    period="week"
)

print(f"Found {len(result)} buckets")
for bucket, count in result[:30]:
    print(f"Bucket: {bucket}, Count: {count}")

# Example: Get bucket counts for the past month (no release filter)
print("Past month")
result = get_bucket_counts(
    period="month"
)

print(f"Found {len(result)} buckets")
for bucket, count in result[:30]:
    print(f"Bucket: {bucket}, Count: {count}")

# Example: Get bucket counts for the nautilus package today
print("Nautilus package - today")
result = get_bucket_counts(
    period="today",
    package="nautilus",
)

print(f"Found {len(result)} buckets")
for bucket, count in result[:30]:
    print(f"Bucket: {bucket}, Count: {count}")
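
The script above repeats the same "count, then print the first 30" block for every query. A minimal sketch of how that reporting step could be factored out; `format_top_buckets` is a hypothetical helper, not part of this PR, and it only assumes what the script already relies on: that the result is a sequence of `(bucket, count)` pairs. The sample data stands in for a real `get_bucket_counts()` result.

```python
from typing import Iterable, List, Tuple


def format_top_buckets(result: Iterable[Tuple[str, int]], limit: int = 30) -> List[str]:
    """Return a summary line followed by one line per bucket, up to `limit`."""
    rows = list(result)
    lines = [f"Found {len(rows)} buckets"]
    lines += [f"Bucket: {bucket}, Count: {count}" for bucket, count in rows[:limit]]
    return lines


# Stand-in data; a real run would pass the value returned by get_bucket_counts().
sample = [("bucket-a", 12), ("bucket-b", 7), ("bucket-c", 3)]
for line in format_top_buckets(sample, limit=2):
    print(line)
```

Returning the lines instead of printing them directly keeps the helper easy to check without a database connection.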
18 changes: 18 additions & 0 deletions examples/cassie_functions/get_crash.py
@@ -0,0 +1,18 @@
#!/usr/bin/env python3
"""Example usage of get_crash function."""

import sys
sys.path.insert(0, '../../src')

from errortracker.cassandra import setup_cassandra
from errors.cassie import get_crash

# Setup Cassandra connection
setup_cassandra()

# Example: Get crash details
oopsid = "e3855456-cecb-11f0-b91f-fa163ec44ecd"
columns = ["Package", "StacktraceAddressSignature"]

crash_data = get_crash(oopsid, columns=columns)
print(f"Crash data: {crash_data}")
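
The sample `oopsid` above is a version 1 UUID, which embeds a creation timestamp. Assuming OOPS IDs are generated as version 1 UUIDs in general (the source does not confirm this), the timestamp can be recovered with the standard library alone. The helper name `uuid1_to_datetime` is hypothetical:

```python
import uuid
from datetime import datetime, timezone

# Number of 100-ns intervals between the UUID epoch (1582-10-15) and the Unix epoch.
UUID_EPOCH_OFFSET = 0x01B21DD213814000


def uuid1_to_datetime(value: str) -> datetime:
    """Return the UTC timestamp embedded in a version 1 UUID string."""
    u = uuid.UUID(value)
    if u.version != 1:
        raise ValueError(f"{value} is not a version 1 UUID")
    # u.time is the 60-bit timestamp in 100-ns units since the UUID epoch.
    seconds = (u.time - UUID_EPOCH_OFFSET) / 1e7
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


print(uuid1_to_datetime("e3855456-cecb-11f0-b91f-fa163ec44ecd"))
```

This reflects when the UUID was generated, which is not necessarily the same as the crash's own `Date` field.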
22 changes: 22 additions & 0 deletions examples/cassie_functions/get_crash_count.py
View file Open in desktop
Original file line number Diff line number Diff line change
@@ -0,0 +1,22 @@
#!/usr/bin/env python3
"""Example usage of get_crash_count function."""

import sys
sys.path.insert(0, '../../src')

from errortracker.cassandra import setup_cassandra
from errors.cassie import get_crash_count

# Setup Cassandra connection
setup_cassandra()

# Example: Get crash count for Ubuntu 24.04
start = 3
finish = 10
release = "Ubuntu 24.04"

for date, count in get_crash_count(start, finish, release=release):
    print(f"Date: {date}, Release: {release}, Crashes: {count}")

for date, count in get_crash_count(start, finish):
    print(f"Date: {date}, Crashes: {count}")
26 changes: 26 additions & 0 deletions examples/cassie_functions/get_crashes_for_bucket.py
View file Open in desktop
Original file line number Diff line number Diff line change
@@ -0,0 +1,26 @@
#!/usr/bin/env python3
"""Example usage of get_crashes_for_bucket function."""

import sys
sys.path.insert(0, '../../src')

from errortracker.cassandra import setup_cassandra
from errors.cassie import get_crashes_for_bucket

# Setup Cassandra connection
setup_cassandra()

# Example: Get crashes for a specific bucket
bucketid = "/bin/zsh:11:makezleparams:execzlefunc:redrawhook:zlecore:zleread"
limit = 10

crashes = get_crashes_for_bucket(bucketid, limit=limit)
print(f"Found {len(crashes)} crashes")
for crash in crashes:
    print(f"Crash ID: {crash}")

start_uuid = "cbb0a4b6-d120-11f0-a9ed-fa163ec8ca8c"
crashes = get_crashes_for_bucket(bucketid, limit=limit, start=start_uuid)
print(f"Found {len(crashes)} crashes (started at {start_uuid})")
for crash in crashes:
    print(f"Crash ID: {crash}")
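
The second call above uses `start` as a cursor into the bucket's crash list. A minimal sketch of how `limit` and `start` could be combined to walk a whole bucket page by page. It makes several assumptions not confirmed by the source: that the function returns a list of IDs, that `start=None` means "from the beginning", and that `start` may be inclusive (the cursor row is skipped if it is returned again). `iter_all_crashes` and `fake_fetch` are hypothetical; `fake_fetch` stands in for `get_crashes_for_bucket` so the sketch runs without a database.

```python
from typing import Callable, Iterator, List, Optional


def iter_all_crashes(
    fetch_page: Callable[..., List[str]],
    bucketid: str,
    page_size: int = 10,
) -> Iterator[str]:
    """Yield every crash ID by calling fetch_page repeatedly with a start cursor."""
    if page_size < 2:
        # With an inclusive cursor, a page of size 1 would never advance.
        raise ValueError("page_size must be at least 2")
    start: Optional[str] = None
    while True:
        page = fetch_page(bucketid, limit=page_size, start=start)
        if start is not None and page and page[0] == start:
            page = page[1:]  # drop the cursor row that was already yielded
        if not page:
            return
        yield from page
        start = page[-1]


def fake_fetch(bucketid: str, limit: int, start: Optional[str] = None) -> List[str]:
    """Stand-in for get_crashes_for_bucket with an inclusive start cursor."""
    ids = [f"id-{n:02d}" for n in range(25)]
    begin = ids.index(start) if start is not None else 0
    return ids[begin:begin + limit]


all_ids = list(iter_all_crashes(fake_fetch, "some-bucket", page_size=10))
print(len(all_ids))
```

Because the cursor row is only dropped when it actually reappears, the same loop also works if `start` turns out to be exclusive.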
18 changes: 18 additions & 0 deletions examples/cassie_functions/get_metadata_for_bucket.py
@@ -0,0 +1,18 @@
#!/usr/bin/env python3
"""Example usage of get_metadata_for_bucket function."""

import sys
sys.path.insert(0, '../../src')

from errortracker.cassandra import setup_cassandra
from errors.cassie import get_metadata_for_bucket

# Setup Cassandra connection
setup_cassandra()

# Example: Get metadata for a specific bucket
bucketid = "/bin/zsh:11:makezleparams:execzlefunc:redrawhook:zlecore:zleread"
release = "Ubuntu 24.04"

metadata = get_metadata_for_bucket(bucketid, release=release)
print(f"Metadata: {metadata}")
19 changes: 19 additions & 0 deletions examples/cassie_functions/get_metadata_for_buckets.py
View file Open in desktop
Original file line number Diff line number Diff line change
@@ -0,0 +1,19 @@
#!/usr/bin/env python3
"""Example usage of get_metadata_for_buckets function."""

import sys
sys.path.insert(0, '../../src')

from errortracker.cassandra import setup_cassandra
from errors.cassie import get_metadata_for_buckets

# Setup Cassandra connection
setup_cassandra()

# Example: Get metadata for multiple buckets
bucketids = ["bucket_1", "bucket_2", "bucket_3"]
release = "Ubuntu 24.04"

metadata_dict = get_metadata_for_buckets(bucketids, release=release)
for bucketid, metadata in metadata_dict.items():
    print(f"Bucket {bucketid}: {metadata}")
26 changes: 26 additions & 0 deletions examples/cassie_functions/get_package_crash_rate.py
View file Open in desktop
Original file line number Diff line number Diff line change
@@ -0,0 +1,26 @@
#!/usr/bin/env python3
"""Example usage of get_package_crash_rate function."""

import sys
sys.path.insert(0, '../../src')

from errortracker.cassandra import setup_cassandra
from errors.cassie import get_package_crash_rate

# Setup Cassandra connection
setup_cassandra()

# Example: Get crash rate for a package update
release = "Ubuntu 24.04"
src_package = "firefox"
old_version = "120.0"
new_version = "121.0"
pup = 100 # Phased update percentage
date = "20231115"
absolute_uri = "https://errors.ubuntu.com"

result = get_package_crash_rate(
    release, src_package, old_version, new_version,
    pup, date, absolute_uri, exclude_proposed=False
)
print(f"Crash rate analysis: {result}")
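
The `date` argument above is passed as the string `"20231115"`, which looks like a `YYYYMMDD` value. Assuming that is the expected format (inferred from the sample value only), such a string can be built from a `datetime.date` rather than typed by hand. The helper name `yyyymmdd` is hypothetical:

```python
from datetime import date, timedelta


def yyyymmdd(day: date) -> str:
    """Format a date as a compact YYYYMMDD string, e.g. 20231115."""
    return day.strftime("%Y%m%d")


print(yyyymmdd(date(2023, 11, 15)))
# A relative date, such as yesterday, works the same way.
print(yyyymmdd(date.today() - timedelta(days=1)))
```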
18 changes: 18 additions & 0 deletions examples/cassie_functions/get_package_for_bucket.py
View file Open in desktop
Original file line number Diff line number Diff line change
@@ -0,0 +1,18 @@
#!/usr/bin/env python3
"""Example usage of get_package_for_bucket function."""

import sys
sys.path.insert(0, '../../src')

from errortracker.cassandra import setup_cassandra
from errors.cassie import get_package_for_bucket

# Setup Cassandra connection
setup_cassandra()

# Example: Get package information for a bucket
bucketid = "example_bucket_id_12345"

package, version = get_package_for_bucket(bucketid)
print(f"Package: {package}")
print(f"Version: {version}")