NESTFUL: A Benchmark for Evaluating LLMs on Nested Sequences of API Calls. Kinjal Basu, Ibrahim Abdelaziz, et al. 2025. EMNLP 2025.