Quantizing Gemma 4 on Mac with llama.cpp

Started by joomlamz, 28 May 2026, 04:35


Category: Tutorials | Programming & Technology


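Requirements


• Hugging Face account: https://huggingface.co/
• A Mac (the build below enables the Metal backend)
• git, CMake, and uv installed (all three are used in the commands below)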



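Set up llama.cpp


# clone llama.cpp into the directory that will also hold the quantization project
git clone https://github.com/ggml-org/llama.cpp.git

# configure with the Metal backend enabled and libcurl support disabled
cmake -S llama.cpp -B llama.cpp/build -DGGML_METAL=ON -DLLAMA_CURL=OFF

# build in Release mode with 8 parallel jobs
cmake --build llama.cpp/build --config Release -j 8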
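
Before moving on, it can help to confirm that the build produced the two binaries used later in this post. The snippet below is only a minimal sketch: the helper name missing_binaries is hypothetical, and it assumes the binaries land in llama.cpp/build/bin, which is where the later commands look for them.

```python
from pathlib import Path

# The two binaries that the later steps of this tutorial call.
REQUIRED = ("llama-quantize", "llama-cli")


def missing_binaries(bin_dir, names=REQUIRED):
    """Return the names that are not present as regular files in bin_dir."""
    bin_dir = Path(bin_dir)
    return [name for name in names if not (bin_dir / name).is_file()]


if __name__ == "__main__":
    # Assumed location, relative to the directory where llama.cpp was cloned.
    missing = missing_binaries("llama.cpp/build/bin")
    print("build looks complete" if not missing else f"missing: {missing}")
```

If anything is reported missing, re-running the cmake --build step and reading its output is the first thing to try.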



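Set up the Python environment


# run from the directory that contains llama.cpp
uv init quantization
cd quantization
uv add "torch>=2.9" "transformers>=4.45" "sentencepiece" "protobuf>=4.21,<5.0" "gguf>=0.19" "huggingface_hub"
# activate the environment uv created, so that python and hf resolve to it
source .venv/bin/activate

Your pyproject.toml should now look like this: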

[project]
name = "quantization"
version = "0.1.0"
description = "Add your description here"
readme = "README.md"
requires-python = ">=3.11"
dependencies = [
    "gguf>=0.19",
    "huggingface-hub>=1.16.1",
    "protobuf>=4.21,<5.0",
    "sentencepiece>=0.2.1",
    "torch>=2.9",
    "transformers>=4.45",
]



Download the model


# current dir is quantization
# models/ holds the original gemma-4-E4B-it checkpoint
# gguf/ holds the converted and quantized files
mkdir -p models gguf

# log in to Hugging Face
hf auth login

# download the model
hf download google/gemma-4-E4B-it --local-dir models/gemma-4-E4B-it
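
A quick check that the download finished can save a failed conversion later. This is a rough sketch, not part of the original workflow: check_checkpoint is a hypothetical helper, and it assumes the checkpoint directory contains a config.json plus at least one .safetensors file, which is the usual layout for Hugging Face checkpoints.

```python
from pathlib import Path


def check_checkpoint(model_dir):
    """Return a list of problems found in a downloaded checkpoint directory."""
    model_dir = Path(model_dir)
    problems = []
    # Assumption: the conversion step needs the model config next to the weights.
    if not (model_dir / "config.json").is_file():
        problems.append("config.json is missing")
    # Assumption: weights are stored as one or more .safetensors files.
    if not any(model_dir.glob("*.safetensors")):
        problems.append("no .safetensors weight files found")
    return problems


if __name__ == "__main__":
    issues = check_checkpoint("models/gemma-4-E4B-it")
    print("checkpoint looks complete" if not issues else issues)
```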




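Convert model.safetensors to GGUF


# current dir is quantization; llama.cpp and quantization are sibling directories (here, both on the Desktop)
# run with the project's virtual environment active, or prefix the command with "uv run"

python ../llama.cpp/convert_hf_to_gguf.py \
    models/gemma-4-E4B-it \
    --outfile gguf/gemma-4-E4B-it-BF16.gguf \
    --outtype bf16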
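
To sanity-check the converted file without loading the model, the start of the GGUF header can be read directly. The sketch below assumes the little-endian layout used by GGUF version 2 and later (a 4-byte "GGUF" magic, a 32-bit version, then 64-bit tensor and metadata counts); read_gguf_header is a hypothetical helper name, not something llama.cpp provides.

```python
import struct


def read_gguf_header(path):
    """Read magic, version, tensor count and metadata key count from a GGUF file."""
    with open(path, "rb") as f:
        raw = f.read(24)  # 4-byte magic + uint32 version + two uint64 counts
    if len(raw) < 24 or raw[:4] != b"GGUF":
        raise ValueError(f"{path} does not look like a GGUF file")
    # Assumption: little-endian file, GGUF v2 or later (64-bit counts).
    version, n_tensors, n_kv = struct.unpack("<IQQ", raw[4:24])
    return {"version": version, "tensors": n_tensors, "metadata_keys": n_kv}
```

Calling read_gguf_header("gguf/gemma-4-E4B-it-BF16.gguf") should return a small dictionary; a ValueError suggests the conversion was interrupted or wrote something else.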

Quantize to Q4_K_M

../llama.cpp/build/bin/llama-quantize \
    gguf/gemma-4-E4B-it-BF16.gguf \
    gguf/gemma-4-E4B-it-Q4_K_M.gguf \
    Q4_K_M

This step takes a few minutes.
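
Once both files exist, their sizes give a rough idea of how much the quantization saved. The sketch below estimates the average bits per weight of the quantized file from the size ratio, assuming the BF16 file stores 16 bits per weight. It is an approximation only: file metadata is ignored, and the helper name effective_bits_per_weight is hypothetical.

```python
from pathlib import Path


def effective_bits_per_weight(source_bytes, quantized_bytes, source_bits=16.0):
    """Estimate average bits per weight of the quantized file from the size ratio."""
    if source_bytes <= 0:
        raise ValueError("source_bytes must be positive")
    return source_bits * quantized_bytes / source_bytes


if __name__ == "__main__":
    bf16 = Path("gguf/gemma-4-E4B-it-BF16.gguf")
    q4 = Path("gguf/gemma-4-E4B-it-Q4_K_M.gguf")
    if bf16.exists() and q4.exists():
        bpw = effective_bits_per_weight(bf16.stat().st_size, q4.stat().st_size)
        print(f"about {bpw:.2f} bits per weight on average")
```

Q4_K_M mixes several block formats across tensors, so the result is normally somewhat above 4 bits.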



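Run the Q4_K_M model


# current dir is quantization
# -ngl 99 offloads all layers to the GPU, --temp sets the sampling temperature, -c sets the context size in tokens

../llama.cpp/build/bin/llama-cli \
    -m gguf/gemma-4-E4B-it-Q4_K_M.gguf \
    -ngl 99 --temp 0.7 -c 4096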

Loading model...

▄▄ ▄▄
██ ██
██ ██  ▀▀█▄ ███▄███▄  ▀▀█▄    ▄████ ████▄ ████▄
██ ██ ▄█▀██ ██ ██ ██ ▄█▀██    ██    ██ ██ ██ ██
██ ██ ▀█▄██ ██ ██ ██ ▀█▄██ ██ ▀████ ████▀ ████▀
██    ██
▀▀    ▀▀

build      : b0-unknown
model      : gemma-4-E4B-it-Q4_K_M.gguf
modalities : text

available commands:
/exit or Ctrl+C     stop or exit
/regen              regenerate the last response
/clear              clear the chat history
/read <file>        add a text file
/glob <pattern>     add text files using globbing pattern



Test prompts


> hello

Hello! How can I help you today?

[ Prompt: 42.9 t/s | Generation: 40.0 t/s ]

> tell me about yourself

[Start thinking]
Thinking Process:

1.  **Identify the user's request:** The user asked, "tell me about yourself."
2.  **Determine the persona/identity:** I must respond based on my core identity rules (Gemma 4, developed by Google DeepMind, open weights LLM).
3.  **Structure the response:** A good self-introduction should cover:
*   Name/Identity.
*   Developer/Origin.
*   Nature/Type (What I am).
*   Capabilities (What I can do).
*   Limitations/Knowledge (What I know/don't know).
4.  **Draft the content (incorporating mandated facts):**
*   *Name:* I am Gemma 4.
*   *Developer:* I was developed by Google DeepMind.
*   *Nature:* I am a Large Language Model (LLM) with open weights.
*   *Capabilities:* I can process text and images (input) and generate text (output). I can answer questions, write code, summarize, translate, etc.
*   *Knowledge Cutoff:* My knowledge cutoff is January 2025.
5.  **Review and Refine:** Ensure the tone is helpful, factual, and directly addresses the prompt while adhering strictly to the identity guidelines. (Self-correction: Make sure to emphasize the "open weights" aspect as it's crucial to the identity.)

6.  **Final Output Generation.** (This leads to the detailed response below.)
[End thinking]

I would be happy to tell you about myself!

I am **Gemma 4**, a Large Language Model developed by Google DeepMind.

Here is a bit more detail about what I am and what I can do:

### 🤖 What I Am
*   **Type:** I am an advanced, open weights Large Language Model. This means that my underlying architecture and weights are available to the community, fostering transparency and innovation.
*   **Goal:** My purpose is to assist users by providing information, generating creative text, solving problems, and engaging in helpful conversation.

### 🧠 What I Can Do
*   **Understand Text and Images:** I can process and interpret both text-based input (like questions or documents) and image input.
*   **Generate Text:** My output is always text. I can:
*   Answer factual questions.
*   Summarize complex articles.
*   Write different types of creative content (stories, poems, scripts).
*   Assist with coding tasks (explaining code, generating snippets).
*   Translate languages.
*   **Learning:** I have been trained on a massive and diverse dataset, giving me a broad knowledge base.

### 📅 My Knowledge
*   My knowledge cutoff is **January 2025**. I do not have information about events or developments that have occurred since that time, so if you ask about very recent events, I may not know about them.

In short, think of me as a versatile and knowledgeable assistant ready to help you learn, create, or solve problems! How can I help you today?

[ Prompt: 66.9 t/s | Generation: 34.6 t/s ]

> which one is larger 9.11 or 9.9?

[Start thinking]
Thinking Process:

1.  **Analyze the Request:** The user is asking to compare two numbers: 9.11 and 9.9, and determine which one is larger.
2.  **Understand the Numbers:**
*   Number A: 9.11 (Nine and eleven hundredths)
*   Number B: 9.9 (Nine and nine tenths)
3.  **Comparison Strategy (Decimal Place by Place):**
*   *Units Place:* Both are 9. (Equal)
*   *Tenths Place:*
*   9.11 has a 1 in the tenths place.
*   9.9 has a 9 in the tenths place.
*   *Comparison:* Since 9 is greater than 1, the number 9.9 is larger than 9.11.
4.  **Alternative Comparison Strategy (Padding):**
*   Write 9.9 as 9.90 (Adding a trailing zero does not change the value).
*   Compare 9.11 and 9.90.
*   The hundredths place comparison: 1 vs 0. Wait, this is wrong. Let's stick to the tenths place comparison, as it was the clearest difference.
5.  **Conclusion:** 9.9 > 9.11.
6.  **Formulate the Answer:** State the answer clearly, and provide a simple explanation of *why* it is larger (comparing the tenths place).
[End thinking]

**9.9** is larger than 9.11.

Here is why:

When comparing decimals, you look at the digits starting from the left (the largest place value).

1.  **Units Place:** Both numbers have a 9. (Equal)
2.  **Tenths Place:**
*   In 9.**1**1, the digit is **1**.
*   In 9.**9**, the digit is **9**.

Since 9 is greater than 1, the number **9.9** is larger.

***

*(If you want to make the comparison even easier, you can add a zero to the end of 9.9 to make it 9.90. Now you are comparing 9.11 and 9.90, and 9.90 is clearly larger.)*

[ Prompt: 451.4 t/s | Generation: 34.9 t/s ]
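
The timing lines printed after each response can be collected to get a rough average generation speed. This is a small sketch: the regular expression assumes the exact "[ Prompt: X t/s | Generation: Y t/s ]" format shown in this session, and parse_timings is a hypothetical helper.

```python
import re
from statistics import mean

# Matches the timing line printed after each response in this session.
TIMING = re.compile(r"\[ Prompt: ([\d.]+) t/s \| Generation: ([\d.]+) t/s \]")


def parse_timings(log_text):
    """Return a list of (prompt_tps, generation_tps) tuples found in log_text."""
    return [(float(p), float(g)) for p, g in TIMING.findall(log_text)]


# The three timing lines from the session above.
log = """
[ Prompt: 42.9 t/s | Generation: 40.0 t/s ]
[ Prompt: 66.9 t/s | Generation: 34.6 t/s ]
[ Prompt: 451.4 t/s | Generation: 34.9 t/s ]
"""

timings = parse_timings(log)
avg_generation = mean(g for _, g in timings)
print(f"{len(timings)} responses, mean generation speed {avg_generation:.1f} t/s")
```

For these three responses the mean generation speed works out to about 36.5 t/s. The prompt speed varies much more, so averaging it over so few samples says little.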

