Published in TDS Archive

Combining Large and Small LLMs to Boost Inference Time and Quality

Implementing Speculative and Contrastive Decoding

Dec 5, 2024