Rune in Golang

In Go, a rune represents a Unicode code point. Technically, it's an alias for the int32 type. Working with runes is essential when dealing with characters from various languages and scripts, as they often go beyond the ASCII range. This tutorial will guide you through the basics of working with runes in Go.

1. Basics of Runes

A rune literal is written as a character enclosed in single quotes:

package main

import "fmt"

func main() {
    var r rune = 'A'
    fmt.Println(r)  // Outputs: 65
}

The above example outputs 65, which is the Unicode code point for the character A.
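Rune literals can also be written with escape sequences, and fmt has verbs for viewing the same value in different ways. The following is a minimal sketch; the helper name describe is made up for illustration and is not part of any package.

```go
package main

import "fmt"

// describe (hypothetical helper) formats a rune as its character,
// its U+XXXX code point, and its decimal value.
func describe(r rune) string {
	return fmt.Sprintf("%c %U %d", r, r, r)
}

func main() {
	fmt.Println(describe('A'))      // A U+0041 65
	fmt.Println(describe('\x41'))   // A U+0041 65 (hex byte escape)
	fmt.Println(describe('世'))      // 世 U+4E16 19990
	fmt.Println(describe('\u4e16')) // 世 U+4E16 19990 (Unicode escape)
}
```

All four literals have type rune; the escapes are just alternative spellings of the same integer values.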

2. Strings and Runes

Strings in Go are read-only sequences of bytes that conventionally hold UTF-8 encoded text. A for range loop over a string decodes that UTF-8 and yields one rune per iteration, not one byte:

s := "Hello, 世界"
for _, r := range s {
    fmt.Printf("%c ", r)
}
// Outputs: H e l l o ,   世 界 
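The index that range produces is a byte offset, not a rune count, which becomes visible as soon as the string contains multi-byte characters. A small sketch, assuming a hypothetical helper named runeOffsets:

```go
package main

import "fmt"

// runeOffsets (hypothetical helper) returns the byte offset at which
// each rune of s begins.
func runeOffsets(s string) []int {
	var offsets []int
	for i := range s {
		offsets = append(offsets, i)
	}
	return offsets
}

func main() {
	s := "Hello, 世界"
	fmt.Println(runeOffsets(s)) // [0 1 2 3 4 5 6 7 10]

	// A classic index loop, by contrast, visits every byte.
	for i := 0; i < len(s); i++ {
		fmt.Printf("%x ", s[i])
	}
	fmt.Println()
}
```

The offsets jump from 7 to 10 because 世 occupies three bytes in UTF-8.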

3. Converting Strings to Rune Slices

If you want to get all runes from a string as a slice:

s := "Hello, 世界"
runes := []rune(s)
fmt.Println(runes)  // Outputs: [72 101 108 108 111 44 32 19990 30028]
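A rune slice is handy whenever characters need to be rearranged. As an illustration, here is a sketch of reversing a string by code point; the function name reverse is an assumption made for this example. Note that it reverses code points, so text that uses combining marks may still not come out as a reader would expect.

```go
package main

import "fmt"

// reverse (hypothetical helper) returns s with its runes in reverse order.
func reverse(s string) string {
	runes := []rune(s)
	for i, j := 0, len(runes)-1; i < j; i, j = i+1, j-1 {
		runes[i], runes[j] = runes[j], runes[i]
	}
	return string(runes)
}

func main() {
	fmt.Println(reverse("Hello, 世界")) // 界世 ,olleH
}
```

Reversing the bytes directly would scramble the multi-byte characters and produce invalid UTF-8.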

4. Length of Strings in Bytes vs. Runes

Strings have a length in bytes, but this might differ from their length in runes:

s := "世界"
fmt.Println(len(s))          // Outputs: 6 ("世界" is 6 bytes in UTF-8)
fmt.Println(len([]rune(s)))  // Outputs: 2 ("世界" consists of 2 runes)

5. Manipulating Runes

Since runes are just integers, they can be manipulated using standard arithmetic:

r := 'A'
fmt.Printf("%c\n", r+1)  // Outputs: B
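Rune arithmetic is commonly used for simple ASCII conversions. The sketch below assumes ASCII-only input; digitValue and toUpperASCII are hypothetical helpers written for this example, and for general Unicode text the unicode package (for instance unicode.ToUpper) is the safer choice.

```go
package main

import "fmt"

// digitValue (hypothetical helper) converts an ASCII digit rune to its
// numeric value. The bool is false when r is not in '0'..'9'.
func digitValue(r rune) (int, bool) {
	if r < '0' || r > '9' {
		return 0, false
	}
	return int(r - '0'), true
}

// toUpperASCII (hypothetical helper) upper-cases a..z and leaves every
// other rune unchanged.
func toUpperASCII(r rune) rune {
	if r >= 'a' && r <= 'z' {
		return r - ('a' - 'A')
	}
	return r
}

func main() {
	v, ok := digitValue('7')
	fmt.Println(v, ok)                       // 7 true
	fmt.Printf("%c\n", toUpperASCII('g'))    // G
	fmt.Printf("%c\n", toUpperASCII('世'))    // 世
}
```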

6. Special Runes

There are certain special runes predefined in Go, such as unicode.MaxRune, unicode.ReplacementChar, etc., available in the unicode package.
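The replacement character matters in practice because Go substitutes it (as utf8.RuneError, which has the same value, U+FFFD) whenever it decodes bytes that are not valid UTF-8. The following sketch counts such bytes; countInvalid is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"unicode"
	"unicode/utf8"
)

// countInvalid (hypothetical helper) reports how many bytes of s fail
// to decode as UTF-8.
func countInvalid(s string) int {
	n := 0
	for len(s) > 0 {
		r, size := utf8.DecodeRuneInString(s)
		// RuneError with size 1 signals a decoding failure; a genuine
		// U+FFFD in the input decodes with size 3.
		if r == utf8.RuneError && size == 1 {
			n++
		}
		s = s[size:]
	}
	return n
}

func main() {
	fmt.Printf("%U\n", unicode.MaxRune)         // U+10FFFF
	fmt.Printf("%U\n", unicode.ReplacementChar) // U+FFFD
	fmt.Println(countInvalid("a\xffb"))         // 1
	fmt.Println(countInvalid("\uFFFD"))         // 0
}
```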

7. Checking Rune Properties

The unicode package provides several functions to test runes for various properties, such as whether they're letters, digits, etc.:

import "unicode"

r := 'A'
fmt.Println(unicode.IsLetter(r))  // Outputs: true
fmt.Println(unicode.IsDigit(r))   // Outputs: false
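These predicates work on any script, not only ASCII. As a rough illustration, the sketch below tallies letters, digits, and whitespace in a mixed string; countClasses is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"unicode"
)

// countClasses (hypothetical helper) counts the letters, digits, and
// whitespace runes in s. Runes in other categories are ignored.
func countClasses(s string) (letters, digits, spaces int) {
	for _, r := range s {
		switch {
		case unicode.IsLetter(r):
			letters++
		case unicode.IsDigit(r):
			digits++
		case unicode.IsSpace(r):
			spaces++
		}
	}
	return
}

func main() {
	fmt.Println(countClasses("Go 1.21 世界")) // 4 3 2
}
```

Here 世 and 界 are counted as letters, while the period falls into none of the three categories.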

Key Takeaways:

  • A rune in Go represents a Unicode code point.
  • Go source code and string literals are UTF-8 encoded, and ranging over a string yields its runes, not its bytes.
  • You can convert a string to a slice of runes if you need to manipulate individual characters.
  • The unicode package provides a set of handy functions to work with runes.

Understanding runes is essential when dealing with internationalized text, ensuring that your Go programs are inclusive and work seamlessly across different languages and scripts.

  1. Iterating over runes in Golang strings:

    A Go string is a sequence of bytes; a for range loop decodes it as UTF-8 and yields one Unicode code point (rune) per iteration.

    package main
    
    import "fmt"
    
    func iterateRunes(str string) {
        for _, r := range str {
            fmt.Printf("%c ", r)
        }
    }
    
    func main() {
        text := "Hello, 世界"
        iterateRunes(text)
    }
    
  2. Converting between runes and strings in Golang:

    You can convert between runes and strings using type conversions.

    func runeToString(r rune) string {
        return string(r)
    }
    
    func stringToRunes(str string) []rune {
        return []rune(str)
    }
    
  3. Rune functions in the Golang unicode package:

    The unicode package provides functions for working with runes, such as checking for uppercase, lowercase, or title case.

    import "unicode"
    
    func isUpperCase(r rune) bool {
        return unicode.IsUpper(r)
    }
    
  4. Golang utf-8 encoding and decoding with runes:

    Go source code and string literals are UTF-8 encoded. The unicode/utf8 package encodes runes into bytes and decodes bytes back into runes.

    import "unicode/utf8"
    
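    func encodeRune(r rune) []byte {
        // Size the buffer for the longest possible encoding: utf8.RuneLen
        // returns -1 for an invalid rune, which would make make() panic.
        buf := make([]byte, utf8.UTFMax)
        n := utf8.EncodeRune(buf, r) // invalid runes are written as U+FFFD
        return buf[:n]
    }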
    
    func decodeRune(encoded []byte) (r rune, size int) {
        r, size = utf8.DecodeRune(encoded)
        return
    }
    
  5. Common pitfalls and issues with runes in Golang:

    Indexing a string (s[i]) returns a byte, and len(s) counts bytes, so neither is reliable for text outside ASCII, where one character can span several bytes. Count code points with utf8.RuneCountInString instead.

    import "unicode/utf8"
    
    func runeCount(str string) int {
        return utf8.RuneCountInString(str)
    }
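
A related pitfall is slicing a string at an arbitrary byte index, which can cut a multi-byte character in half. The sketch below shows one way to truncate by rune count instead; truncateRunes is a hypothetical helper, not a standard library function.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// truncateRunes (hypothetical helper) returns at most the first n runes
// of s, always cutting on a rune boundary.
func truncateRunes(s string, n int) string {
	if n <= 0 {
		return ""
	}
	count := 0
	for i := range s {
		if count == n {
			return s[:i] // i is the byte offset where rune n+1 starts
		}
		count++
	}
	return s // s has n runes or fewer
}

func main() {
	s := "Hello, 世界"
	fmt.Println(utf8.ValidString(s[:8])) // false: the slice ends inside 世
	fmt.Println(truncateRunes(s, 8))     // Hello, 世
}
```

Because range reports byte offsets at rune boundaries, the function never allocates a rune slice and never produces invalid UTF-8 from valid input.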