Tags: unslothai/unsloth
Use UTF-8 for Python code-execution subprocess I/O (#6489 class) (#6548)

* Use UTF-8 for Python code-execution subprocess I/O

Studio's code-execution tool already tells the child to emit UTF-8 (PYTHONIOENCODING=utf-8 in _build_safe_env), but _python_exec writes the temp script and decodes the subprocess pipe with the OS default codec. On Windows (cp1252), non-ASCII in model-written code or its output -- arrows, CJK, emoji -- raises UnicodeEncodeError / UnicodeDecodeError and breaks execution.

Complete the UTF-8 wiring in core/inference/tools.py:
- write the temp script with encoding="utf-8"
- decode _python_exec stdout as utf-8, errors="replace"
- set PYTHONIOENCODING=utf-8 in _build_bypass_env too (matches _build_safe_env, so the bypass path's child also emits utf-8)

The child is python with PYTHONIOENCODING=utf-8, so it emits UTF-8 regardless of the console code page and the decode is always correct. Shell execution via cmd.exe has a separate console-code-page story and is left to a follow-up.

Refs #6489

* Scope Python exec UTF-8 env to Python tool

* Make bash bypass test robust to a host-set PYTHONIOENCODING for PR #6548

Bypass mode preserves benign host env vars, so a host-set PYTHONIOENCODING was inherited into the bash bypass env and tripped the new assertion even though _bash_exec never adds it. Clear it in the test so the assertion checks _bash_exec, not the runner environment.

---------

Co-authored-by: Lee Jackson <130007945+Imagineer99@users.noreply.github.com>
Co-authored-by: Daniel Han <danielhanchen@gmail.com>
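The three UTF-8 steps described in this commit can be illustrated with a minimal, standalone sketch. This is not the code in core/inference/tools.py: the function name `run_python_snippet` and its `timeout` parameter are hypothetical, and only standard-library calls are used. It writes the temp script as UTF-8, asks the child to emit UTF-8 through PYTHONIOENCODING, and decodes the pipe as UTF-8 with errors="replace".

```python
import os
import subprocess
import sys
import tempfile


def run_python_snippet(code: str, timeout: float = 30.0) -> str:
    """Run `code` in a child Python process and return its output as text.

    Hypothetical helper for illustration only.
    """
    # Write the script as UTF-8 explicitly. The OS default codec (for example
    # cp1252 on Windows) cannot encode arrows, CJK, or emoji.
    with tempfile.NamedTemporaryFile(
        "w", suffix=".py", encoding="utf-8", delete=False
    ) as fh:
        fh.write(code)
        script_path = fh.name
    try:
        # Ask the child interpreter to encode stdout/stderr as UTF-8,
        # independent of the console code page.
        env = dict(os.environ)
        env["PYTHONIOENCODING"] = "utf-8"
        proc = subprocess.run(
            [sys.executable, script_path],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            env=env,
            timeout=timeout,
        )
        # Decode the raw bytes ourselves; errors="replace" keeps a stray
        # invalid byte from raising UnicodeDecodeError.
        return proc.stdout.decode("utf-8", errors="replace")
    finally:
        os.unlink(script_path)
```

Reading the pipe as bytes and decoding explicitly, rather than passing text=True, keeps the decode independent of the parent's locale.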
Studio: keep llama-server discovery from crashing on an access-denied candidate (#6268)

* Studio: keep llama-server discovery from crashing on an access-denied candidate

_find_llama_server_binary probed candidates with Path.is_file(), which raises PermissionError (WinError 5) when a path exists but is momentarily inaccessible (antivirus lock, an install replace in flight, an elevated-install ACL), aborting model validation. Treat a denied-but-present path as the real binary so discovery returns it; absent paths still skip.

* Retry a transiently locked binary instead of returning a denied path

Returning a still-denied path only moved the PermissionError to the next is_file() (probe_server_capabilities). Retry briefly so a transient lock clears and discovery returns an accessible path; on a persistent lock return nothing rather than a path downstream cannot stat.

* Studio: do not fall back to another llama-server when a pinned one is locked

A denied LLAMA_SERVER_PATH made discovery skip the explicit pin and run a lower-priority managed or PATH binary, so a load could silently use a stale or incompatible server. Split the probe into a file/absent/denied status: when the pinned path exists but stays access-denied, warn and stop rather than falling back to a different executable.

* Studio: never downgrade past a denied pinned or managed llama-server

Extend the no-fallback rule beyond LLAMA_SERVER_PATH: a present-but-denied UNSLOTH_LLAMA_CPP_PATH or managed ($STUDIO_HOME/llama.cpp, ~/.unsloth/llama.cpp) binary now reports temporarily-unavailable instead of silently launching a lower-priority legacy or PATH server. Shared _scan_pinned/_unavailable helpers; legacy in-tree and PATH stay genuine fallbacks (a denied candidate there just continues).

* Studio: let diffusion asset lookup use a locked llama-server path for its dir

DiffusionGemma does not run llama-server; _find_diffusion_assets only needs the install dir to find the adjacent llama-diffusion-gemma-visual-server. The no-fallback rule returning None on a transiently locked llama-server therefore hid an available visual-server and raised 'runner not found'. Add an include_denied option so diffusion lookup gets the locked path (its dir is all it needs), while inference keeps the no-denied-path, no-downgrade behavior.

* Studio: report a locked llama-server as temporarily unavailable, not missing

When the pinned/managed binary stays access-denied through the retries, discovery returns None and load_model raised 'binary not found', a terminal error that points users at reinstalling rather than retrying a transient AV/install lock. Reuse include_denied to detect the locked path and raise a distinct temporarily-unavailable, retry message instead.

* Studio: GGUF preflight treats a locked llama-server as present

The pre-download preflight (and so /api/inference/validate) used the default discovery, which returns None for a transiently access-denied binary, so it raised 'binary not found' for a binary that merely needs the lock to clear. Use include_denied so the existence check counts a locked binary as present; the load itself still reports a still-locked binary as temporarily unavailable.
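The discovery rules in this commit (a file/absent/denied probe, a short retry for transient locks, no downgrade past a denied pinned candidate, and an include_denied escape hatch) can be sketched as follows. This is an illustration under assumptions, not Studio's implementation: `ProbeStatus`, `probe`, `probe_with_retry`, and `find_binary` are hypothetical names, and the retry count and delay are arbitrary.

```python
import time
from enum import Enum
from pathlib import Path
from typing import Optional, Sequence


class ProbeStatus(Enum):
    FILE = "file"      # exists and is accessible
    ABSENT = "absent"  # does not exist
    DENIED = "denied"  # probing raised PermissionError


def probe(path: Path) -> ProbeStatus:
    """Classify one candidate without letting PermissionError escape."""
    try:
        return ProbeStatus.FILE if path.is_file() else ProbeStatus.ABSENT
    except PermissionError:
        return ProbeStatus.DENIED


def probe_with_retry(path: Path, attempts: int = 3, delay: float = 0.1) -> ProbeStatus:
    """Re-probe a denied path a few times so a transient lock can clear."""
    status = probe(path)
    for _ in range(attempts - 1):
        if status is not ProbeStatus.DENIED:
            break
        time.sleep(delay)
        status = probe(path)
    return status


def find_binary(
    pinned: Sequence[Path],
    fallbacks: Sequence[Path],
    include_denied: bool = False,
    attempts: int = 3,
    delay: float = 0.1,
) -> Optional[Path]:
    """Return the first usable binary, never downgrading past a denied pin."""
    for candidate in pinned:
        status = probe_with_retry(candidate, attempts, delay)
        if status is ProbeStatus.FILE:
            return candidate
        if status is ProbeStatus.DENIED:
            # Present but locked: stop here instead of falling back to a
            # lower-priority binary. Callers that only need the path (for
            # example its directory) can opt in with include_denied.
            return candidate if include_denied else None
    for candidate in fallbacks:
        # A denied fallback is simply skipped.
        if probe_with_retry(candidate, attempts, delay) is ProbeStatus.FILE:
            return candidate
    return None
```

A caller that wants to distinguish "locked" from "missing" could call `find_binary` a second time with `include_denied=True` when the first call returns None, which mirrors how the commit describes reusing include_denied for the temporarily-unavailable message.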
studio: add keyboard navigation to model picker (#5628)

* Fix model picker keyboard navigation

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Address model picker review feedback

* Fix model picker tab order with row actions

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Remove model picker keyboard contract test

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Etherll <61019402+Etherll@users.noreply.github.com>
fix(studio): keep local GGUF vision on llama-server (#5770)

* fix(studio): keep local GGUF vision on llama-server

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix(studio): lower local GGUF vision log level

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix(studio): find GGUF companions from variant dirs

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Lee Jackson <130007945+Imagineer99@users.noreply.github.com>
Co-authored-by: imagineer99 <samleejackson0@gmail.com>
DiffusionGemma: disable tools, enable artifacts canvas by default (#6255)

DiffusionGemma serves via the visual runner, which streams per-step canvas frames so the answer resolves live in the bubble. The agentic tool loop (generate_chat_completion_with_tools) does not forward those frames, so whenever a tool pill (Search/Code) was on, the live canvas silently vanished while text still streamed. DiffusionGemma is not a tool-calling target anyway, so report supports_tools=False for it: the chat always takes the frame-forwarding path, and the Search/Code pills disable themselves (a local model has no builtin web search either).

Also turn the artifacts canvas on by default for DiffusionGemma so a full-HTML answer (e.g. a playable game) renders as an interactive sandboxed card without the user flipping the global artifacts toggle.
Fix macOS Apple Silicon installs resolving torch against x86_64 (#5976)

* Fix macOS Apple Silicon installs that resolve torch against x86_64

On Apple Silicon, `uv venv --python 3.13` can reuse a cached x86_64 (Rosetta) CPython, often because uv itself is an x86_64 build. The resulting venv reports macosx_*_x86_64 to the wheel resolver, but PyTorch has shipped no macOS x86_64 wheels since 2.2.2, so the torch install fails with "no wheels with a matching platform tag (macosx_..._x86_64)".

Two changes, both scoped to macOS arm64 and additive (no other install path is affected):
- Create the venv with an arch-explicit `cpython-X.Y-macos-aarch64-none` request on Apple Silicon (no --python override), so uv cannot fall back to a cached x86_64 interpreter.
- Harden the existing x86_64 venv guard: when the venv python cannot be executed (x86_64 binary on a Mac without Rosetta), the platform.machine() probe returns empty and the recreate was silently skipped. Fall back to reading the binary's Mach-O arch via lipo/file so migrated or pre-existing x86_64 venvs are still recreated as arm64.

* Harden arm64 static-arch fallback: file -L and set -e safety

Address review feedback on the lipo/file fallback:
- uv symlinks the venv's bin/python to the base interpreter; plain `file` reports the symlink ("symbolic link to ...") and the arch substring never matches. Use `file -L` to dereference (lipo already follows the link).
- Append `|| true` so the command substitution cannot abort the installer under set -e on a Mac that has neither lipo nor file.

---------

Co-authored-by: danielhanchen <michaelhan2050@gmail.com>
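The guard logic described above lives in the shell installer; the following is only a Python rendering of the same decision, written as a sketch. The function names `static_arch_output` and `needs_recreate` are hypothetical, and the rule for universal binaries (treat a binary that lists arm64 as acceptable) is an assumption not stated in the commit. It tries `lipo -archs`, then `file -L`, tolerates either tool being missing, and falls back to the static result only when the platform.machine() probe came back empty.

```python
import subprocess


def static_arch_output(python_path: str) -> str:
    """Return raw lipo/file output for the interpreter, or '' if neither works.

    Hypothetical helper; a missing tool is ignored, similar in spirit to the
    installer's `|| true`.
    """
    for cmd in (["lipo", "-archs", python_path], ["file", "-L", python_path]):
        try:
            result = subprocess.run(cmd, capture_output=True, text=True, check=False)
        except OSError:
            continue  # tool not installed or not executable
        if result.returncode == 0 and result.stdout.strip():
            return result.stdout
    return ""


def needs_recreate(probe_machine: str, static_output: str) -> bool:
    """Decide whether an existing venv should be recreated as arm64.

    probe_machine is what running the venv python's platform.machine() gave,
    or '' when the interpreter could not be executed at all.
    """
    if probe_machine:
        return probe_machine == "x86_64"
    # The probe failed, so rely on the static Mach-O inspection.
    # Assumption: a binary that also lists arm64 is left alone.
    return "x86_64" in static_output and "arm64" not in static_output
```

Without `-L`, `file` on a symlinked bin/python yields text such as "symbolic link to ...", which contains no arch substring, so `needs_recreate` would return False for it; that is the failure mode the second commit bullet addresses.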