Hugging Face offers serverless inference for models on its Model Hub via [Hugging Face's Inference API](https://huggingface.co/docs/inference-providers/en/index).
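Since many Hub models lack native function calling, the example on this page asks the LLM to reply with a fenced code block and extracts the code with a regex. That extraction step can be sanity-checked in isolation; the sketch below uses a hypothetical `extract_python_block` helper and a fabricated model reply to show the expected input and output:

```python
import re

# Hypothetical helper: pull the first ```python fenced block out of a model reply.
def extract_python_block(llm_response):
    # re.DOTALL lets '.' match newlines, so multi-line code blocks are captured.
    match = re.search(r"```python\n(.*?)\n```", llm_response, re.DOTALL)
    return match.group(1) if match else ""

reply = "Here is the code:\n```python\nprint('strawberry'.count('r'))\n```\nDone."
print(extract_python_block(reply))  # → print('strawberry'.count('r'))
```

If the model reply contains no fenced block, the helper returns an empty string, which is a convenient sentinel for "nothing to execute".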
<Note>
Note that not every model on Hugging Face has native support for tool use and function calling.
</Note>

<CodeGroup>
```python
from huggingface_hub import InferenceClient
from e2b_code_interpreter import Sandbox
import re

# Not all models support direct tool use, so we prompt the LLM to respond
# with a Python code block and extract the code from it manually.
def match_code_block(llm_response):
    pattern = re.compile(r'```python\n(.*?)\n```', re.DOTALL)  # Match everything between ```python and ```
    match = pattern.search(llm_response)
    if match:
        code = match.group(1)
        print(code)
        return code
    return ""


system_prompt = """You are a helpful coding assistant that can execute python code in a Jupyter notebook. You are given tasks to complete and you run Python code to solve them.
Generally, you follow these rules:
- ALWAYS FORMAT YOUR RESPONSE IN MARKDOWN
- ALWAYS RESPOND ONLY WITH CODE IN CODE BLOCK LIKE THIS:
\`\`\`python
{code}
\`\`\`
"""
prompt = "Calculate how many r's are in the word 'strawberry.'"

# Initialize the client
client = InferenceClient(
    provider="hf-inference",
    api_key="HF_INFERENCE_API_KEY"
)

completion = client.chat.completions.create(
    model="Qwen/Qwen3-235B-A22B",  # Or use any other model from Hugging Face