How to use dujun/modernbert-embed-base-dj-ft-v2 with sentence-transformers:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("dujun/modernbert-embed-base-dj-ft-v2")
sentences = [
"Besides penalties, what other consequences might result from continuous rule breaches?",
"20 Misconduct \n20.1\tMisconduct warranting Penalty, Forced Interchange, Sin Bin or Dismissal \nincludes:\n20.1.1\tContinuous or regular breaches of the Rules;\n20.1.2\tSwearing towards another player, Referee, spectator or other match \t\nofficial;\n20.1.3\tDisputing decisions of Referees or other match official(s);\n20.1.4\tUsing more than the necessary physical force to make a Touch;\n20.1.5\tPoor sportsmanship;\n20.1.6\tTripping, striking, or otherwise assaulting another player, Referee, \nspectator or other match official; or\n20.1.7\tAny other action that is contrary to the spirit of the game.21 Forced Interchange \n21.1\tWhere the Referee deems it necessary to implement a Forced Interchange \nfollowing an Infringement, the Referee is to stop the match, direct the ball to \nbe placed on the Mark, advise the offending player of the reason for the Forced \nInterchange, direct that player to return to the Interchange Area, display the \nrelevant signal and award a Penalty to the non-offending Team.",
"Line \nMarkings are to be laid out as shown in Appendix 1 - The Field of Play.Sidelines \nextend seven (7) metres beyond the Try Lines to join the Dead Ball Lines and \ndefine the In-Goal Areas which measure fifty (50) metres wide by seven (7) \nmetres in length.1.3\tThe Interchange Areas are located no closer than one (1) metre from each \nSideline.1.4\tSuitably sized markers, cones or corner posts of a distinguishing colour and \nmade from safe and pliable material should be positioned at the intersections of \nthe Sideline and Halfway line and the Sideline and the Try Line.1.4.1\tMarkers, cones or corner posts placed on the junction of the Sideline and \nTry Line are deemed to be in the Field of Play.1.4.2\tAll other markers or cones are deemed to be out of the Field of Play.1.5\tThe standard playing surface is grass.Other surfaces including synthetic grass \nmay be used but shall be subject to NTA approved standards.",
"The ball may be passed, knocked or handed between players \nof the Attacking Team who may in turn run or otherwise move with the ball in an \nattempt to gain territorial Advantage and to score Tries.Defending players prevent \nthe Attacking Team from gaining a territorial Advantage by touching the ball carrier.1 The Field of Play \n \n1.1\tThe Field of Play is rectangular in shape measuring 70 metres in length from \nTry Line to Try Line, excluding the In-Goal Areas and 50 metres in width from \nSideline to Sideline excluding the Interchange Areas.1.1.1\tVariations to the dimensions of the Field of Play may be made but must be \nincluded in relevant competition, event or tournament conditions\n1.2\tLine Markings should be 4cm in width but must be no less than 2.5cm.Line \nMarkings are to be laid out as shown in Appendix 1 - The Field of Play."
]
embeddings = model.encode(sentences)
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [4, 4]

This is a sentence-transformers model finetuned from nomic-ai/modernbert-embed-base on the touch-rugby-modernbert-pairs dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
SentenceTransformer(
(0): Transformer({'max_seq_length': 8192, 'do_lower_case': False}) with Transformer model: ModernBertModel
(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
)
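The Pooling and Normalize modules listed above can be illustrated with a small pure-Python sketch. It is not the library's implementation: the toy 2-dimensional token vectors and the helper names `mean_pool` and `l2_normalize` are assumptions made for illustration, and the real Pooling layer also ignores padding tokens via the attention mask.

```python
import math

def mean_pool(token_vectors):
    """Average the token vectors element-wise (mean pooling)."""
    count = len(token_vectors)
    dim = len(token_vectors[0])
    return [sum(vec[i] for vec in token_vectors) / count for i in range(dim)]

def l2_normalize(vector):
    """Scale a vector to unit length, as a Normalize step would."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]

# Hypothetical token embeddings for a three-token input (real ones have 768 dims).
tokens = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
pooled = mean_pool(tokens)          # one vector for the whole input
embedding = l2_normalize(pooled)    # unit-length sentence embedding
```

Because the final embedding has unit length, the cosine similarity between two embeddings equals their dot product.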
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("dujun/modernbert-embed-base-dj-ft-v2")
# Run inference
sentences = [
'When does a player cease to be the Half?',
'13.10\tA player ceases to be the Half once the ball is passed to another player.13.11\tDefending players are not to interfere with the performance of the Rollball or the \nHalf.Ruling = A Penalty to the Attacking Team at a point ten (10) metres directly Forward of the \nInfringement.13.12\tPlayers of the Defending Team must not move Forward of the Onside position \nuntil the Half has made contact with the ball, unless directed to do so by the \nReferee or in accordance with 13.12.1.13.12.1\tWhen the Half is not within one (1) metre of the Rollball, Onside players \nof the Defending Team may move Forward as soon as the player \nperforming the Rollball releases the ball.If the Half is not in position and \na defending player moves Forward and makes contact with the ball, a \nChange of Possession results.',
'18.7\tA player may perform a Rollball instead of a Penalty Tap and the player who \nreceives the ball does not become the Half.18.8\tIf the Defending Team is penalised three (3) times upon entering their Seven \nMetre Zone during a single Possession, the last offending player will be given an \nExclusion until the end of that Possession.18.9\tA Penalty Try is awarded if any action by a player, Team official or spectator, \ndeemed by the Referee to be contrary to the Rules or spirit of the game clearly \nprevents the Attacking Team from scoring a Try.FIT Playing Rules - 5th Edition\nCOPYRIGHT © Touch Football Australia 2020\n15\n19\u2002 Advantage \n19.1\tWhere a Defending Team player is Offside at a Tap or Rollball and attempts \nto interfere with play, the Referee will allow Advantage or award a Penalty, \nwhichever is of greater Advantage to the Attacking Team.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
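For semantic search, the similarity scores are typically used to rank candidate chunks against a question. The sketch below shows that ranking step on toy 2-dimensional vectors so it runs without downloading the model; in practice the vectors would come from `model.encode`. The helper names `cosine` and `rank_chunks` are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_chunks(query_vec, chunk_vecs):
    """Return (score, chunk_index) pairs sorted from most to least similar."""
    scored = [(cosine(query_vec, vec), i) for i, vec in enumerate(chunk_vecs)]
    return sorted(scored, reverse=True)

# Toy stand-ins for a question embedding and three chunk embeddings.
query_vec = [1.0, 0.0]
chunk_vecs = [[0.0, 1.0], [0.8, 0.6], [1.0, 0.0]]

ranking = rank_chunks(query_vec, chunk_vecs)
best_index = ranking[0][1]  # index of the chunk most similar to the question
```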
Columns: question and related_chunk

| | question | related_chunk |
|---|---|---|
| type | string | string |
| question | related_chunk |
|---|---|
| Where does a change of possession occur if a touch is made in In-Goal? | Ruling = A Penalty to the non-offending Team at the point of the Infringement.10.4 If the ball is accidentally knocked from the hands of a player in Possession |
| What section details the field of play in the Touch Rugby Rules 5th Edition? | FIT Playing Rules - 5th Edition |
| What is one of the Referee's responsibilities before the match commences? | An approach may only be made during a break in play or at |
MultipleNegativesRankingLoss with these parameters:
{
"scale": 20.0,
"similarity_fct": "cos_sim"
}
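As a rough illustration of what these parameters control, the sketch below computes an in-batch-negatives loss in pure Python: each question's paired chunk is the positive, every other chunk in the batch is a negative, and the cosine similarities are multiplied by `scale` before a softmax cross-entropy. This is a simplified sketch, not the library code; the function name `mnrl_loss` and the toy similarity matrices are assumptions.

```python
import math

def mnrl_loss(sim_matrix, scale=20.0):
    """Mean cross-entropy over rows, where row i's correct column is i.

    sim_matrix[i][j] is the cosine similarity between question i and chunk j.
    """
    total = 0.0
    for i, row in enumerate(sim_matrix):
        logits = [scale * s for s in row]
        top = max(logits)  # subtract the max for numerical stability
        log_sum_exp = top + math.log(sum(math.exp(l - top) for l in logits))
        total += log_sum_exp - logits[i]
    return total / len(sim_matrix)

# Toy batches of two (question, chunk) pairs.
aligned = [[0.9, 0.1], [0.2, 0.8]]     # each question is closest to its own chunk
misaligned = [[0.1, 0.9], [0.8, 0.2]]  # each question is closest to the wrong chunk
uniform = [[0.5, 0.5], [0.5, 0.5]]     # no preference: loss equals ln(2)

loss_aligned = mnrl_loss(aligned)
loss_misaligned = mnrl_loss(misaligned)
loss_uniform = mnrl_loss(uniform)
```

A larger `scale` sharpens the softmax, so small differences in cosine similarity produce larger differences in loss.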
Columns: question and related_chunk

| | question | related_chunk |
|---|---|---|
| type | string | string |
| question | related_chunk |
|---|---|
| Where must a player's identifying number be displayed? | 3.2 The ball shall be inflated to the manufacturers’ recommended air pressure.3.3 The Referee shall immediately pause the match if the size and shape of the ball |
| Besides penalties, what other consequences might result from continuous rule breaches? | 20 Misconduct |
| Can a Rollball be performed after a Touch has been made? | Ruling = A Penalty to the Defending Team at the point of the Infringement.13.5 A player may only perform a Rollball at the Mark under the following |
MultipleNegativesRankingLoss with these parameters:
{
"scale": 20.0,
"similarity_fct": "cos_sim"
}
Non-default hyperparameters:
eval_strategy: steps
per_device_train_batch_size: 32
per_device_eval_batch_size: 32
learning_rate: 5e-06
num_train_epochs: 1
lr_scheduler_type: constant
warmup_ratio: 0.3

All hyperparameters:
overwrite_output_dir: False
do_predict: False
eval_strategy: steps
prediction_loss_only: True
per_device_train_batch_size: 32
per_device_eval_batch_size: 32
per_gpu_train_batch_size: None
per_gpu_eval_batch_size: None
gradient_accumulation_steps: 1
eval_accumulation_steps: None
torch_empty_cache_steps: None
learning_rate: 5e-06
weight_decay: 0.0
adam_beta1: 0.9
adam_beta2: 0.999
adam_epsilon: 1e-08
max_grad_norm: 1.0
num_train_epochs: 1
max_steps: -1
lr_scheduler_type: constant
lr_scheduler_kwargs: {}
warmup_ratio: 0.3
warmup_steps: 0
log_level: passive
log_level_replica: warning
log_on_each_node: True
logging_nan_inf_filter: True
save_safetensors: True
save_on_each_node: False
save_only_model: False
restore_callback_states_from_checkpoint: False
no_cuda: False
use_cpu: False
use_mps_device: False
seed: 42
data_seed: None
jit_mode_eval: False
use_ipex: False
bf16: False
fp16: False
fp16_opt_level: O1
half_precision_backend: auto
bf16_full_eval: False
fp16_full_eval: False
tf32: None
local_rank: 0
ddp_backend: None
tpu_num_cores: None
tpu_metrics_debug: False
debug: []
dataloader_drop_last: False
dataloader_num_workers: 0
dataloader_prefetch_factor: None
past_index: -1
disable_tqdm: False
remove_unused_columns: True
label_names: None
load_best_model_at_end: False
ignore_data_skip: False
fsdp: []
fsdp_min_num_params: 0
fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
tp_size: 0
fsdp_transformer_layer_cls_to_wrap: None
accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
deepspeed: None
label_smoothing_factor: 0.0
optim: adamw_torch
optim_args: None
adafactor: False
group_by_length: False
length_column_name: length
ddp_find_unused_parameters: None
ddp_bucket_cap_mb: None
ddp_broadcast_buffers: False
dataloader_pin_memory: True
dataloader_persistent_workers: False
skip_memory_metrics: True
use_legacy_prediction_loop: False
push_to_hub: False
resume_from_checkpoint: None
hub_model_id: None
hub_strategy: every_save
hub_private_repo: None
hub_always_push: False
gradient_checkpointing: False
gradient_checkpointing_kwargs: None
include_inputs_for_metrics: False
include_for_metrics: []
eval_do_concat_batches: True
fp16_backend: auto
push_to_hub_model_id: None
push_to_hub_organization: None
mp_parameters:
auto_find_batch_size: False
full_determinism: False
torchdynamo: None
ray_scope: last
ddp_timeout: 1800
torch_compile: False
torch_compile_backend: None
torch_compile_mode: None
include_tokens_per_second: False
include_num_input_tokens_seen: False
neftune_noise_alpha: None
optim_target_modules: None
batch_eval_metrics: False
eval_on_start: False
use_liger_kernel: False
eval_use_gather_object: False
average_tokens_across_devices: False
prompts: None
batch_sampler: batch_sampler
multi_dataset_batch_sampler: proportional

| Epoch | Step | Training Loss | Validation Loss |
|---|---|---|---|
| 0.2 | 1 | - | 2.7507 |
| 0.4 | 2 | 3.6185 | 2.7254 |
| 0.6 | 3 | - | 2.7059 |
| 0.8 | 4 | 3.4585 | 2.6828 |
| 1.0 | 5 | - | 2.6653 |
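The log shows one epoch completing in 5 optimizer steps. The card does not state the training set size, but a rough bound can be worked out from the step count, assuming a single device together with the listed `per_device_train_batch_size: 32`, `gradient_accumulation_steps: 1` and `dataloader_drop_last: False`. The helper names below are hypothetical, and the result is an estimate under those assumptions, not a reported figure.

```python
import math

def steps_per_epoch(num_samples, batch_size):
    """Number of batches per epoch when the last partial batch is kept."""
    return math.ceil(num_samples / batch_size)

def sample_range_for_steps(steps, batch_size):
    """Smallest and largest dataset sizes that yield exactly `steps` batches."""
    return ((steps - 1) * batch_size + 1, steps * batch_size)

# 5 steps per epoch at batch size 32, as in the log above.
low, high = sample_range_for_steps(5, 32)
```

Under these assumptions the training split would hold between `low` and `high` question/chunk pairs.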
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
Base model
answerdotai/ModernBERT-base