In Go, a rune represents a Unicode code point. Technically, it's an alias for the int32 type. Working with runes is essential when dealing with characters from various languages and scripts, as they often go beyond the ASCII range. This tutorial will guide you through the basics of working with runes in Go.
A rune literal is written as a character enclosed in single quotes:
    package main

    import "fmt"

    func main() {
        var r rune = 'A'
        fmt.Println(r) // Outputs: 65
    }
The above example outputs 65, which is the Unicode code point for the character A.
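As a small illustration of the point above, the sketch below prints one rune with several fmt verbs and shows that escaped literals denote the same value. The helper name describe is hypothetical, introduced only for this example.

```go
package main

import "fmt"

// describe is a hypothetical helper: it formats r as a character,
// a quoted literal, a code point in U+ notation, and a decimal number.
func describe(r rune) string {
	return fmt.Sprintf("%c %q %U %d", r, r, r, r)
}

func main() {
	var a rune = 'A'
	var b int32 = a // no conversion needed: rune is an alias for int32

	// '\x41' and '\u0041' are escaped spellings of the same code point.
	fmt.Println(a == '\x41', a == '\u0041', b) // true true 65

	fmt.Println(describe('世')) // 世 '世' U+4E16 19990
}
```

The %d verb exposes the integer behind the rune, while %c and %q render it as text.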
Strings in Go are sequences of bytes, but they often contain textual data encoded as UTF-8. When ranging over a string, Go will iterate over its runes, not its bytes:
    s := "Hello, 世界"
    for _, r := range s {
        fmt.Printf("%c ", r)
    }
    // Outputs: H e l l o ,   世 界
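The first value produced by a range loop over a string is the byte offset at which each rune starts, not a running character count. The following sketch makes that visible; runeOffsets is a hypothetical helper written for this example.

```go
package main

import "fmt"

// runeOffsets returns the byte offset at which each rune of s starts.
func runeOffsets(s string) []int {
	var offsets []int
	for i := range s {
		offsets = append(offsets, i)
	}
	return offsets
}

func main() {
	s := "Hello, 世界"
	for i, r := range s {
		fmt.Printf("byte %d: %c (%U)\n", i, r, r)
	}
	// The last two offsets are 3 apart because '世' occupies 3 bytes.
	fmt.Println(runeOffsets(s)) // [0 1 2 3 4 5 6 7 10]
}
```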
If you want to get all runes from a string as a slice:
    s := "Hello, 世界"
    runes := []rune(s)
    fmt.Println(runes) // Outputs: [72 101 108 108 111 44 32 19990 30028]
Strings have a length in bytes, but this might differ from their length in runes:
    s := "世界"
    fmt.Println(len(s))         // Outputs: 6 (because "世界" is 6 bytes in UTF-8)
    fmt.Println(len([]rune(s))) // Outputs: 2 (because "世界" consists of 2 runes)
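The same byte/rune distinction applies to indexing: s[i] yields a single byte. One possible way to fetch the n-th rune is sketched below, assuming a linear scan is acceptable; runeAt is a hypothetical helper, not a standard library function.

```go
package main

import "fmt"

// runeAt returns the n-th rune (0-based) of s and whether it exists.
// It walks the string from the start, so it costs O(n) rather than O(1).
func runeAt(s string, n int) (rune, bool) {
	for _, r := range s {
		if n == 0 {
			return r, true
		}
		n--
	}
	return 0, false
}

func main() {
	s := "世界"
	fmt.Println(s[0]) // 228: the first byte of the UTF-8 encoding of '世'

	r, ok := runeAt(s, 1)
	fmt.Printf("%c %v\n", r, ok) // 界 true
}
```

If many lookups by position are needed, converting once with []rune(s) and indexing the slice is usually simpler.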
Since runes are just integers, they can be manipulated using standard arithmetic:
    r := 'A'
    fmt.Printf("%c\n", r+1) // Outputs: B
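Two common uses of rune arithmetic are sketched below: turning an ASCII digit into its numeric value, and rotating a lowercase ASCII letter. Both helpers are hypothetical and deliberately limited to ASCII; they are not meant for general Unicode text.

```go
package main

import "fmt"

// digitValue converts an ASCII digit rune to its numeric value.
func digitValue(r rune) (int, bool) {
	if r < '0' || r > '9' {
		return 0, false
	}
	return int(r - '0'), true
}

// shiftLetter rotates a lowercase ASCII letter by n positions,
// wrapping around the alphabet. Other runes are returned unchanged.
func shiftLetter(r rune, n int) rune {
	if r < 'a' || r > 'z' {
		return r
	}
	// Adding 26 keeps the left operand of % non-negative for negative n.
	return 'a' + (r-'a'+rune(n%26)+26)%26
}

func main() {
	v, ok := digitValue('7')
	fmt.Println(v, ok) // 7 true

	fmt.Printf("%c %c\n", shiftLetter('z', 1), shiftLetter('a', -1)) // a z
}
```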
Go predefines some special rune constants in the unicode package, such as unicode.MaxRune (the largest valid code point, U+10FFFF) and unicode.ReplacementChar (U+FFFD, which stands in for invalid code points).
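To show where these constants turn up in practice, the sketch below ranges over a string containing a byte that is not valid UTF-8; each such byte decodes to U+FFFD. countReplacements is a hypothetical helper, and note that it cannot tell an invalid byte apart from a U+FFFD that was genuinely present in the text.

```go
package main

import (
	"fmt"
	"unicode"
	"unicode/utf8"
)

// countReplacements counts how many runes decode to U+FFFD when ranging
// over s. This includes both invalid bytes and literal U+FFFD characters.
func countReplacements(s string) int {
	n := 0
	for _, r := range s {
		if r == unicode.ReplacementChar {
			n++
		}
	}
	return n
}

func main() {
	fmt.Printf("%U\n", unicode.MaxRune)         // U+10FFFF
	fmt.Printf("%U\n", unicode.ReplacementChar) // U+FFFD

	bad := "a\xffb" // the byte 0xFF never appears in valid UTF-8
	fmt.Println(utf8.ValidString(bad))  // false
	fmt.Println(countReplacements(bad)) // 1
}
```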
The unicode package provides several functions to test runes for various properties, such as whether they're letters, digits, etc.:
    import "unicode"

    r := 'A'
    fmt.Println(unicode.IsLetter(r)) // Outputs: true
    fmt.Println(unicode.IsDigit(r))  // Outputs: false
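These predicates combine naturally with a range loop. As a rough sketch, the hypothetical classify function below tallies letters, digits, and whitespace in a string; everything else (punctuation, symbols) is ignored.

```go
package main

import (
	"fmt"
	"unicode"
)

// classify counts the letters, digits, and whitespace runes in s.
func classify(s string) (letters, digits, spaces int) {
	for _, r := range s {
		switch {
		case unicode.IsLetter(r):
			letters++
		case unicode.IsDigit(r):
			digits++
		case unicode.IsSpace(r):
			spaces++
		}
	}
	return letters, digits, spaces
}

func main() {
	// '世' and '界' count as letters; '.' falls into none of the categories.
	fmt.Println(classify("Go 1.21 世界")) // 4 3 2
}
```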
The unicode package provides a set of handy functions for working with runes. Understanding runes is essential when dealing with internationalized text, so that your Go programs work correctly across different languages and scripts.
Iterating over runes in Golang strings:
Go strings are sequences of bytes that usually hold UTF-8 encoded text. To iterate over the runes rather than the bytes, use a for range loop, which decodes one code point per iteration.
    package main

    import "fmt"

    // iterateRunes prints each rune of str followed by a space.
    func iterateRunes(str string) {
        for _, r := range str {
            fmt.Printf("%c ", r)
        }
    }

    func main() {
        text := "Hello, 世界"
        iterateRunes(text)
    }
Converting between runes and strings in Golang:
You can convert between runes and strings using type conversions.
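    // runeToString returns the UTF-8 encoding of r as a string.
    func runeToString(r rune) string {
        return string(r)
    }

    // stringToRunes decodes str into a slice of code points.
    func stringToRunes(str string) []rune {
        return []rune(str)
    }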
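Two consequences of these conversions are worth seeing in code. First, string(r) yields the character, not its decimal digits; strconv.Itoa is one way to get the number as text. Second, because strings are immutable, editing a character typically goes through a []rune copy. The helper replaceRune below is hypothetical and only illustrates that round trip.

```go
package main

import (
	"fmt"
	"strconv"
)

// replaceRune returns s with its i-th rune (0-based) replaced by r.
// If i is out of range, s is returned unchanged.
func replaceRune(s string, i int, r rune) string {
	runes := []rune(s) // copy into a mutable slice of code points
	if i < 0 || i >= len(runes) {
		return s
	}
	runes[i] = r
	return string(runes)
}

func main() {
	r := '世'
	fmt.Println(string(r), len(string(r))) // 世 3
	fmt.Println(strconv.Itoa(int(r)))      // 19990

	fmt.Println(replaceRune("héllo", 1, 'e')) // hello
}
```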
Rune functions in the Golang unicode package:
The unicode package provides functions for working with runes, such as checking for uppercase, lowercase, or title case.
    import "unicode"

    // isUpperCase reports whether r is an uppercase letter.
    func isUpperCase(r rune) bool {
        return unicode.IsUpper(r)
    }
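Alongside the case predicates, the unicode package has case-mapping functions such as unicode.ToUpper and unicode.ToLower. A minimal sketch of using them is a hypothetical capitalize helper that upper-cases only the first rune; it assumes a one-rune-to-one-rune mapping is good enough, which does not hold for every script.

```go
package main

import (
	"fmt"
	"unicode"
	"unicode/utf8"
)

// capitalize upper-cases the first rune of s and leaves the rest unchanged.
// Empty strings and strings starting with an invalid byte are returned as-is.
func capitalize(s string) string {
	r, size := utf8.DecodeRuneInString(s)
	if size == 0 || (r == utf8.RuneError && size == 1) {
		return s
	}
	return string(unicode.ToUpper(r)) + s[size:]
}

func main() {
	fmt.Println(capitalize("élan"))              // Élan
	fmt.Printf("%c\n", unicode.ToLower('Σ'))     // σ
	fmt.Println(unicode.IsUpper('É'), unicode.IsLower('É')) // true false
}
```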
Golang utf-8 encoding and decoding with runes:
Go source code is UTF-8, so string literals are normally UTF-8 encoded. You can encode and decode individual runes using the unicode/utf8 package.
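    import "unicode/utf8"

    // encodeRune returns the UTF-8 encoding of r. The buffer is sized with
    // utf8.UTFMax because utf8.RuneLen returns -1 for invalid runes, which
    // would make make() panic; EncodeRune writes U+FFFD for such runes.
    func encodeRune(r rune) []byte {
        buf := make([]byte, utf8.UTFMax)
        n := utf8.EncodeRune(buf, r)
        return buf[:n]
    }

    // decodeRune decodes the first rune in encoded and reports its width in bytes.
    func decodeRune(encoded []byte) (r rune, size int) {
        return utf8.DecodeRune(encoded)
    }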
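Decoding usually happens in a loop: decode one rune, advance by its size, repeat. The sketch below follows that pattern with utf8.DecodeRune and records how many bytes each rune occupies; runeSizes is a hypothetical helper for illustration.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// runeSizes returns the encoded length in bytes of each rune in b.
// An invalid byte is reported as a rune of size 1.
func runeSizes(b []byte) []int {
	var sizes []int
	for len(b) > 0 {
		_, size := utf8.DecodeRune(b) // size is at least 1 for non-empty input
		sizes = append(sizes, size)
		b = b[size:]
	}
	return sizes
}

func main() {
	// ASCII, Latin-1 supplement, CJK, and emoji take 1, 2, 3, and 4 bytes.
	fmt.Println(runeSizes([]byte("a©世😀"))) // [1 2 3 4]
}
```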
Common pitfalls and issues with runes in Golang:
Be cautious with assumptions about string indices: s[i] yields a byte and len(s) counts bytes, while a single Unicode character may span several bytes. Use the unicode/utf8 package, for example utf8.RuneCountInString, to count runes accurately.
    import "unicode/utf8"

    // runeCount returns the number of runes (not bytes) in str.
    func runeCount(str string) int {
        return utf8.RuneCountInString(str)
    }
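Two operations that often go wrong when done on bytes are reversing and truncating. The sketch below does both on rune boundaries; reverse and truncate are hypothetical helpers. They treat each code point as one character, so text with combining marks or multi-code-point emoji may still come out wrong.

```go
package main

import "fmt"

// reverse returns s with its runes in reverse order.
func reverse(s string) string {
	runes := []rune(s)
	for i, j := 0, len(runes)-1; i < j; i, j = i+1, j-1 {
		runes[i], runes[j] = runes[j], runes[i]
	}
	return string(runes)
}

// truncate returns at most n runes of s without cutting a rune in half.
func truncate(s string, n int) string {
	if n <= 0 {
		return ""
	}
	count := 0
	for i := range s {
		if count == n {
			return s[:i] // i is the byte offset where rune n+1 starts
		}
		count++
	}
	return s
}

func main() {
	s := "Hello, 世界"
	fmt.Println(reverse(s))     // 界世 ,olleH
	fmt.Println(truncate(s, 8)) // Hello, 世
}
```

By contrast, slicing by bytes, as in s[:8], would cut '世' in the middle and leave invalid UTF-8 at the end of the result.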