How to decide between fine-tuning a transformer and building a custom domain summarizer?
I'm a machine learning engineer working on a document summarization project, and I'm trying to decide between fine-tuning a pre-trained transformer like BART or T5 and building a custom architecture from scratch. Our dataset is domain-specific and relatively small.

For those who have implemented transformer models for similar NLP tasks, what factors led you to choose one approach over the other? I'm particularly concerned about the trade-off between the computational cost of fine-tuning a large model and the performance limitations of a smaller custom transformer. Are techniques like knowledge distillation or parameter-efficient fine-tuning viable for production systems where inference speed is critical?
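To make the parameter-efficient option concrete, here's a minimal sketch of what LoRA fine-tuning of BART might look like, assuming the Hugging Face transformers and peft libraries; the rank/scaling hyperparameters and the model checkpoint are illustrative placeholders, not tuned recommendations:

```python
# Minimal LoRA fine-tuning sketch for BART summarization.
# Assumes: pip install torch transformers peft
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
from peft import LoraConfig, TaskType, get_peft_model

model_name = "facebook/bart-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# LoRA injects trainable low-rank adapters into the attention
# projections; the pre-trained weights stay frozen, so only a
# small fraction of parameters is updated during fine-tuning.
lora_config = LoraConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,
    r=16,                                 # adapter rank (illustrative)
    lora_alpha=32,                        # adapter scaling (illustrative)
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # BART attention projection names
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% trainable

# ... train with Seq2SeqTrainer or a custom loop on the domain data ...

# For production, the adapters can be merged back into the base
# weights so LoRA adds no extra latency at serving time.
model = model.merge_and_unload()

text = "Example domain-specific document to summarize."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=1024)
summary_ids = model.generate(**inputs, max_new_tokens=128, num_beams=4)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```

My understanding is that freezing the base weights should help with our small dataset (less room to overfit) and the merge step removes any inference-time penalty, but I'd welcome corrections if that doesn't hold up in practice.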