Daniil Gavrilov
Unknown affiliation
No verified email address
Title
Cited by
Year
Self-attentive model for headline generation
D Gavrilov, P Kalaidin, V Malykh
Advances in Information Retrieval: 41st European Conference on IR Research …, 2019
Cited by 72, 2019
Learn your reference model for real good alignment
A Gorbatovski, B Shaposhnikov, A Malakhov, N Surnachev, Y Aksenov, ...
arXiv preprint arXiv:2404.09656, 2024
Cited by 25, 2024
Implicit Unlikelihood Training: Improving Neural Text Generation with Reinforcement Learning
E Lagutin, D Gavrilov, P Kalaidin
Proceedings of the 16th Conference of the European Chapter of the …, 2021
Cited by 18, 2021
PALBERT: Teaching ALBERT to Ponder
N Balagansky, D Gavrilov
Advances in Neural Information Processing Systems 35 (NeurIPS 2022), 14002 …, 2022
Cited by 14, 2022
Diffusion Language Models Generation Can Be Halted Early
SM Lo Cicero Vaina, N Balagansky, D Gavrilov
arXiv e-prints, arXiv: 2305.10818, 2023
Cited by 6*, 2023
Classifiers are better experts for controllable text generation
A Sitdikov, N Balagansky, D Gavrilov, A Markov
arXiv preprint arXiv:2205.07276, 2022
Cited by 6, 2022
Linear transformers with learnable kernel functions are better in-context models
Y Aksenov, N Balagansky, SMLC Vaina, B Shaposhnikov, A Gorbatovski, ...
arXiv preprint arXiv:2402.10644, 2024
Cited by 5, 2024
Linear interpolation in parameter space is good enough for fine-tuned language models
M Rofin, N Balagansky, D Gavrilov
arXiv preprint arXiv:2211.12092, 2022
Cited by 3, 2022
Mechanistic Permutability: Match Features Across Layers
N Balagansky, I Maksimov, D Gavrilov
arXiv preprint arXiv:2410.07656, 2024
Cited by 2, 2024
Ahead-of-Time P-Tuning
D Gavrilov, N Balagansky
arXiv preprint arXiv:2305.10835, 2023
Cited by 2, 2023
Weight squeezing: Reparameterization for extreme compression and fast inference
A Chumachenko, D Gavrilov, N Balagansky, P Kalaidin
arXiv preprint arXiv:2010.06993, 2020
Cited by 2, 2020
You Do Not Fully Utilize Transformer's Representation Capacity
G Gerasimov, Y Aksenov, N Balagansky, V Sinii, D Gavrilov
arXiv preprint arXiv:2502.09245, 2025
2025
Analyze Feature Flow to Enhance Interpretation and Steering in Language Models
D Laptev, N Balagansky, Y Aksenov, D Gavrilov
arXiv preprint arXiv:2502.03032, 2025
2025
The Differences Between Direct Alignment Algorithms are a Blur
A Gorbatovski, B Shaposhnikov, V Sinii, A Malakhov, D Gavrilov
arXiv preprint arXiv:2502.01237, 2025
2025
Diffusion Language Models Generation Can Be Halted Early
SMLC Vaina, N Balagansky, D Gavrilov
arXiv preprint arXiv:2305.10818, 2023
2023
FastRPB: a Scalable Relative Positional Encoding for Long Sequence Tasks
M Zubkov, D Gavrilov
arXiv preprint arXiv:2202.11364, 2022
2022