Official implementation of the ACL 2025 oral paper: "Conditional Dichotomy Quantification via Geometric Embedding"
```bibtex
@inproceedings{cui-etal-2025-conditional,
    title     = "Conditional Dichotomy Quantification via Geometric Embedding",
    author    = "Cui, Shaobo and Liu, Wenqing and Feng, Yiyang and Zhou, Jiawei and Faltings, Boi",
    editor    = "Che, Wanxiang and Nabende, Joyce and Shutova, Ekaterina and Pilehvar, Mohammad Taher",
    booktitle = "Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
    month     = jul,
    year      = "2025",
    address   = "Vienna, Austria",
    publisher = "Association for Computational Linguistics",
    url       = "https://aclanthology.org/2025.acl-long.383/",
    doi       = "10.18653/v1/2025.acl-long.383",
    pages     = "7765--7791",
    ISBN      = "979-8-89176-251-0"
}
```
A lightweight toolkit for measuring how "opposite" two texts are when they share the same context.
Software: [opposite-score on PyPI](https://pypi.org/project/opposite-score/)
Pretrained Models: debate-bert · defeasibleNLI-bert · causal-reasoning-bert
- ✨ Why should I care?
- 📦 Installation
- 🗂️ Supportive Datasets (Quick Overview)
- Opposite-Score
- 💡 Responsible Usage
- 🏆 Leaderboard: Dichotomy Quantification
Powered by our Opposite-Score embeddings and three rigorously curated datasets (Debate · Defeasible NLI · Causal Reasoning).
```bash
conda create -n dichotomy python=3.10
conda activate dichotomy

# opposite-score is our implemented package: https://pypi.org/project/opposite-score/
pip install opposite-score
```
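To verify the installation, the import below should succeed; the module path is taken from the quick-start example later in this README:

```python
# Quick sanity check that the install succeeded; the import path matches
# the quick-start example further below.
from oppositescore.model.dichotomye import DichotomyE

print("opposite-score installed:", DichotomyE.__name__)
```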
| Scenario | Train | Val | Test | Total | Avg len (ctx) | Avg len (pos/neg/neu) |
|---|---|---|---|---|---|---|
| Debate | 58k | 21k | 16k | 95k | 8.8 | 11.6 / 11.5 / 11.2 |
| Defeasible NLI | 8k | 8k | 424k | 441k | 23.1 | 8.5 / 8.3 / 8.4 |
| Causal Reasoning | 14k | 18k | 16k | 48k | 21.0 | 8.4 / 10.1 / 9.1 |
*Figure 1. Sentence-length distributions for contexts, positive, negative, and neutral arguments across datasets.*
**Why it matters.** Balanced sentence lengths and human-verified neutral arguments prevent models from "cheating" on superficial cues and keep the evaluation focused on genuine oppositional content.
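Each instance pairs a shared context with a positive, a negative, and a neutral argument. The sketch below shows the general shape of one Debate-style instance; the field names and the neutral sentence are illustrative assumptions, not the datasets' literal schema (the context and positive/negative sentences are borrowed from the quick-start example below):

```python
# Illustrative instance layout; field names are assumptions, not the released schema.
example = {
    "context":  "A company launches a revolutionary product.",
    "positive": "The company gains a significant advantage due to its unique product.",
    "negative": "Competitors quickly release similar products, reducing the company's advantage.",
    "neutral":  "The product is announced at an industry conference.",  # hypothetical neutral
}
```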
Efficient embeddings and a scoring mechanism for detecting contrasting or opposite relationships between texts, conditioned on a given context.
Opposite-Score generates embeddings and computes the opposite-score, which quantifies the degree of contrast or opposition between two textual outputs within the same context. The package is particularly useful in scenarios such as debates, legal reasoning, and causal analysis, where contrasting perspectives must be evaluated against shared input.
- Opposite-Score Calculation: computes a numerical score representing how opposite two texts are, conditioned on a shared context (see the conceptual sketch after this list).
- Opposite-Aware Embeddings: generates embeddings optimized for contrasting textual relationships.
- Easy to Use: sentence/token-level embeddings and opposite scores in only a few lines of code.
- Automatic Model Download: the first initialization automatically downloads the pre-trained Opposite-Score model.
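As a rough intuition only (this is a minimal sketch, not DichotomyE's actual algorithm), the score can be thought of as comparing two continuations after conditioning each on the same context. The toy code below does this with a placeholder encoder and cosine similarity; `embed` is a hypothetical stand-in for any sentence encoder:

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Hypothetical stand-in for any sentence encoder; NOT part of opposite-score."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))  # toy, deterministic per process
    return rng.standard_normal(768)

def toy_opposite_score(context: str, sent1: str, sent2: str) -> float:
    # Condition each continuation on the shared context, then compare embeddings.
    v1 = embed(context + " " + sent1)
    v2 = embed(context + " " + sent2)
    cosine = float(v1 @ v2 / (np.linalg.norm(v1) * np.linalg.norm(v2)))
    # Map cosine similarity in [-1, 1] to a contrast score in [0, 1]:
    # lower similarity between context-conditioned texts -> more opposite.
    return (1.0 - cosine) / 2.0
```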
Install Opposite-Score via pip:
```bash
pip install opposite-score==0.0.1
```
```python
from oppositescore.model.dichotomye import DichotomyE

# Example inputs
context = ["A company launches a revolutionary product."]
sentence1 = ["Competitors quickly release similar products, reducing the company's advantage."]
sentence2 = ["The company gains a significant advantage due to its unique product."]

# Initialize the model
opposite_scorer = DichotomyE.from_pretrained(
    'shaobocui/opposite-score-debate-bert', pooling_strategy='cls'
).cuda()

# Calculate opposite-score (using cosine similarity as an example)
opposite_score = opposite_scorer.calculate_opposite_score(ctx=context, sent1=sentence1, sent2=sentence2)
print('Opposite Score:', opposite_score)
# Output: Opposite Score: 0.11086939
```
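Since the quick-start passes single-element lists, scoring several pairs at once presumably amounts to supplying longer aligned lists. This batching behavior is our assumption about the API (only single-element lists are shown above), and the second example triple is invented for illustration:

```python
# Reuses `opposite_scorer` from the quick-start above.
contexts = [
    "A company launches a revolutionary product.",
    "A city bans cars from its historic center.",          # hypothetical example
]
first_sentences = [
    "Competitors quickly release similar products, reducing the company's advantage.",
    "Air quality improves and foot traffic to local shops increases.",
]
second_sentences = [
    "The company gains a significant advantage due to its unique product.",
    "Local businesses lose customers who can no longer drive in.",
]

# Assumption: aligned (ctx, sent1, sent2) triples are scored element-wise.
scores = opposite_scorer.calculate_opposite_score(
    ctx=contexts, sent1=first_sentences, sent2=second_sentences
)
print(scores)
```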
This software is released for research and educational purposes only. It is intended to support studies of argument contrast, causal reasoning, and sentence embeddings.
Please provide proper attribution when using the code, models, or datasets in publications or derivative work. For commercial use, please contact the authors for explicit permission.
For questions or collaborations, feel free to contact the authors.
| Model | Debate (DCF ↑) | Debate (Angle ↑) | NLI (DCF ↑) | NLI (Angle ↑) | Causal (DCF ↑) | Causal (Angle ↑) |
|---|---|---|---|---|---|---|
| InferSent-GloVe | 36.19 | 1.58 | 23.11 | 0.39 | 26.71 | 0.44 |
| InferSent-fastText | 42.02 | 4.56 | 27.66 | 1.42 | 32.36 | 1.44 |
| USE | 16.53 | 3.31 | 18.07 | 1.01 | 13.54 | 0.46 |
| BERT | 31.37 | 0.18 | 11.99 | 0.25 | 27.17 | 0.26 |
| CoSENT | 38.49 | 0.64 | 26.86 | 0.28 | 30.07 | 0.14 |
| SBERT | 31.61 | 1.50 | 22.89 | 0.64 | 22.68 | 0.43 |
| SimCSE (BERT) | 30.59 | 2.78 | 13.91 | 0.93 | 25.15 | 1.30 |
| AoE (BERT) | 26.27 | 0.48 | 24.02 | 0.10 | 30.09 | 0.11 |
| RoBERTa | 43.61 | 0.00 | 12.60 | 0.00 | 24.06 | 0.00 |
| SimCSE (RoBERTa) | 30.84 | 2.42 | 12.78 | 0.64 | 27.01 | 1.28 |
| LLaMA-2 (7B) | 30.46 | 16.99 | 21.25 | 8.65 | 32.80 | 5.67 |
| LLaMA-2 (13B) | 47.42 | 11.24 | 30.27 | 4.59 | 34.05 | 2.56 |
| AoE (7B) | 38.92 | 14.85 | 20.01 | 8.22 | 27.20 | 4.03 |
| AoE (13B) | 44.88 | 9.73 | 28.72 | 3.20 | 30.89 | 1.58 |
| LLaMA-3.1 (8B) | 39.81 | 10.86 | 21.70 | 5.33 | 26.38 | 2.67 |
| LLaMA-3.1 (70B) | 34.47 | 13.74 | 15.95 | 6.83 | 25.23 | 3.84 |
| **Ours (BERT)** | 46.97 | 30.66 | 41.72 | 3.25 | 67.59 | 20.69 |
| **Ours (RoBERTa)** | 55.93 | 83.67 | 47.27 | 0.63 | 76.55 | 5.06 |
📈 Higher is better for both metrics: DCF reflects classification quality, and Angle reflects the strength of geometric contrast.