Cyclone (programming language)
Memory-safe dialect of the C programming language
From Wikipedia, the free encyclopedia
The Cyclone programming language was intended to be a safe dialect of the C language.[2] It is designed to avoid buffer overflows and other vulnerabilities that are possible in C programs, without losing the power and convenience of C as a tool for system programming. It is no longer supported by its original developers, and the reference tooling does not support 64-bit platforms. The original developers mention the Rust language as having integrated many of the same ideas as Cyclone.[3]
Cyclone development was started as a joint project of Trevor Jim from AT&T Labs Research and Greg Morrisett's group at Cornell University in 2001. Version 1.0 was released on May 8, 2006.[4]
Language features
Cyclone attempts to avoid some of the common pitfalls of C, while still maintaining its look and performance. To this end, Cyclone places the following limits on programs:
- NULL checks are inserted to prevent segmentation faults
- Pointer arithmetic is limited
- Pointers must be initialized before use (this is enforced by definite assignment analysis)
- Dangling pointers are prevented through region analysis and limits on free()
- Only "safe" casts and unions are allowed
- goto into scopes is disallowed
- switch labels in different scopes are disallowed
- Pointer-returning functions must execute return
- setjmp and longjmp are not supported
To maintain the tool set that C programmers are used to, Cyclone provides the following extensions:
- Never-NULL pointers do not require NULL checks
- "Fat" pointers support pointer arithmetic with run-time bounds checking
- Growable regions support a form of safe manual memory management
- Garbage collection for heap-allocated values
- Smart pointers, such as unique pointers
- Region-based memory management
- Tagged unions support type-varying arguments
- Injections help automate the use of tagged unions for programmers
- Polymorphism replaces some uses of void*
- Variadic arguments are implemented as fat pointers
- Exceptions replace some uses of setjmp and longjmp
- Namespaces
- Type inference
- Pattern matching
- Templates, parameterized types
For a better high-level introduction to Cyclone, the reasoning behind Cyclone and the source of these lists, see this paper.
Although Cyclone generally looks much like C, it should be viewed as a C-like language rather than as C itself.
Pointer types
Cyclone implements three kinds of pointer:
- * (the normal type)
- @ (the never-NULL pointer), and
- ? (the only type with pointer arithmetic allowed, "fat" pointers).
The purpose of introducing these new pointer types is to avoid common problems when using pointers. Take, for instance, a function called foo that takes a pointer to an int:
int foo(int* p);
Although the person who wrote the function foo could have inserted NULL checks, let us assume that for performance reasons they did not. Calling foo(NULL); will result in undefined behavior (typically, although not necessarily, a SIGSEGV signal being sent to the application). To avoid such problems, Cyclone introduces the @ pointer type, which can never be NULL. Thus, the "safe" version of foo would be:
int foo(int@ p);
This tells the Cyclone compiler that the argument to foo should never be NULL, avoiding the aforementioned undefined behavior. The simple change of * to @ saves the programmer from having to write NULL checks and the operating system from having to trap NULL pointer dereferences.
The limit on pointer arithmetic, however, can be a rather large stumbling block for most C programmers, who are used to manipulating their pointers directly with arithmetic. Although such arithmetic is convenient, it can lead to buffer overflows and other "off-by-one"-style mistakes. To avoid this, a ? pointer carries a known bound, the size of the array it points into. Although this adds overhead due to the extra information stored with the pointer, it improves safety and security. Take for instance a simple (and naïve) strlen function, written in C:
int strlen(const char* s) {
    int i = 0;
    if (!s) {
        return 0;
    }
    while (s[i] != '\0') {
        i++;
    }
    return i;
}
This function assumes that the string being passed in is terminated by '\0'. However, what would happen if char buf[6] = {'h','e','l','l','o','!'}; were passed to this function? This is perfectly legal in C, yet would cause strlen to read past the end of buf into memory not necessarily associated with the string s. There are functions, such as strnlen, that can be used to avoid such problems, but these functions are not standard with every implementation of ANSI C. The Cyclone version of strlen is not so different from the C version:
int strlen(const char? s) {
    int n = s.size;
    if (!s) {
        return 0;
    }
    for (int i = 0; i < n; i++, s++) {
        if (*s == '\0') {
            return i;
        }
    }
    return n;
}
Here, strlen bounds itself by the length of the array passed to it, thus not going over the actual length. Each of the kinds of pointer type can be safely cast to each of the others, and arrays and strings are automatically cast to ? by the compiler. (Casting from ? to * invokes a bounds check, and casting from ? to @ invokes both a NULL check and a bounds check. Casting from * to ? results in no checks whatsoever; the resulting ? pointer has a size of 1.)
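The bookkeeping that a ? pointer carries can be approximated in plain C. The sketch below is illustrative only: the struct layout and the names fat_ptr, fat_from_array, fat_add, fat_remaining, fat_strlen and fat_to_thin are hypothetical and are not Cyclone's actual run-time representation. It models a pointer that travels with its bounds, a length function that stops at the bound, and a conversion to an ordinary pointer that performs a bounds check first.

```c
#include <stddef.h>

/* Hypothetical model of a "fat" pointer: an array plus a checked offset.
   This is an assumption for illustration, not Cyclone's real layout. */
typedef struct {
    const char *base;  /* start of the underlying array (may be NULL) */
    size_t len;        /* number of elements in the array */
    size_t off;        /* current offset, moved by pointer arithmetic */
} fat_ptr;

/* Build a fat pointer from an array and its element count. */
fat_ptr fat_from_array(const char *arr, size_t n) {
    fat_ptr p = { arr, arr ? n : 0, 0 };
    return p;
}

/* Pointer arithmetic only moves the offset; the check happens on use. */
fat_ptr fat_add(fat_ptr p, size_t k) {
    p.off += k;
    return p;
}

/* Number of elements still reachable from the current offset. */
size_t fat_remaining(fat_ptr p) {
    if (p.base == NULL || p.off >= p.len) {
        return 0;
    }
    return p.len - p.off;
}

/* Bounded strlen: stops at '\0' or at the array bound, whichever is first. */
size_t fat_strlen(fat_ptr s) {
    size_t n = fat_remaining(s);
    for (size_t i = 0; i < n; i++) {
        if (s.base[s.off + i] == '\0') {
            return i;
        }
    }
    return n;
}

/* Model of a checked conversion to a plain pointer: returns NULL when the
   offset is out of bounds, where a checked language would report an error. */
const char *fat_to_thin(fat_ptr p) {
    return fat_remaining(p) > 0 ? p.base + p.off : NULL;
}
```

With the unterminated char buf[6] from the earlier example, fat_strlen(fat_from_array(buf, sizeof buf)) stops at the bound and returns 6 instead of reading past the array.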
Dangling pointers and region analysis
Consider the following code, in C:
char* itoa(int i) {
    char buf[20];
    sprintf(buf, "%d", i);
    return buf;
}
The function itoa allocates an array of chars buf on the stack and returns a pointer to the start of buf. However, the memory used on the stack for buf is deallocated when the function returns, so the returned value cannot be used safely outside of the function. While GNU Compiler Collection and other compilers will warn about such code, the following will typically compile without warnings:
char* itoa(int i) {
    char buf[20];
    sprintf(buf, "%d", i);
    char* z = buf;
    return z;
}
GNU Compiler Collection can produce warnings for such code as a side-effect of option -O2 or -O3, but there are no guarantees that all such errors will be detected.
Cyclone performs region analysis on each segment of code, preventing dangling pointers such as the one returned from this version of itoa. All of the local variables in a given scope are considered to be part of the same region, separate from the heap or any other local region. Thus, when analyzing itoa, the Cyclone compiler would see that z is a pointer into the local stack, and would report an error.
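In plain C, a conventional way to avoid returning a pointer into a dead stack frame is to let the caller own the storage. The sketch below is one such rewrite, not something Cyclone produces; the name itoa_into and its signature are hypothetical. Because the buffer lives in the caller's scope, the returned pointer remains valid for as long as that buffer does.

```c
#include <stddef.h>
#include <stdio.h>

/* Write the decimal form of i into a caller-owned buffer.
   Returns buf on success, or NULL if buf is missing or too small. */
char *itoa_into(int i, char *buf, size_t size) {
    if (buf == NULL || size == 0) {
        return NULL;
    }
    int written = snprintf(buf, size, "%d", i);
    if (written < 0 || (size_t)written >= size) {
        return NULL;  /* encoding error or truncated output */
    }
    return buf;
}
```

A caller would declare char buf[20]; and call itoa_into(42, buf, sizeof buf), checking for a NULL result before using the string.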
Fat pointers
A fat pointer allows pointer arithmetic. Fat pointers must be declared with @fat. For example, argv is often declared as type char** (a pointer to a pointer to a character), or alternatively thought of as char*[] (an array of pointers to characters). In Cyclone, this is instead expressed as char*@fat*@fat (a fat pointer to a fat pointer to characters).
Cyclone also allows ? as shorthand for *@fat. Thus, the following two declarations are equivalent:
int main(int argc, char?? argv);
// equivalent to the more verbose declaration
int main(int argc, char*@fat*@fat argv);
Parameterized types
Similar to templates in C++, Cyclone has a form of generic programming.
typedef struct LinkedList<`a> {
    `a head;
    struct LinkedList<`a>* next;
} LinkedList<`a>;
// ...
LinkedList<int>* ll = new LinkedList{1, new LinkedList{2, NULL}};
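For contrast, the usual C workaround that a type parameter such as `a replaces is a list of void* elements. The sketch below is plain C with hypothetical names (VoidList, sum_ints); it shows that the element type is only a convention between caller and callee, and that the cast back to int is unchecked.

```c
#include <stddef.h>

/* Untyped list node: head can point at anything, so the compiler
   cannot check that every element has the same type. */
struct VoidList {
    void *head;
    struct VoidList *next;
};

/* Sum a list whose elements are assumed (not checked) to be ints. */
int sum_ints(const struct VoidList *l) {
    int total = 0;
    for (; l != NULL; l = l->next) {
        total += *(const int *)l->head;  /* unchecked cast back to int */
    }
    return total;
}
```

Passing a list of some other element type to sum_ints would still compile, which is the kind of mistake a parameterized list type rules out.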
An "abstract" type can be used, that encapsulates the implementation type but ensures the definition does not leak to the client.
abstract struct Queue<`a> {
    LinkedList<`a> front;
    LinkedList<`a> rear;
};
extern struct Queue<`a>;
Namespaces
Namespaces exist in Cyclone, similar to C++. Namespaces are used to avoid name clashes in code, and follow the :: notation as in C++. Namespaces can be nested.
namespace foo {
    int x;
    int f() {
        return x;
    }
}
namespace bar {
    using foo {
        int g() {
            return f();
        }
    }
    int h() {
        return foo::f();
    }
}
Pattern matching
Pattern-matching can be accomplished in Cyclone like so:
extern int f(int); // assumed to be defined elsewhere
int g(int a, int b) {
    switch ($(a, b - 1)) {
    case $(0, y) && y > 1:
        return 1;
    case $(3, y) && f(a + y) == 7:
        return 2;
    case $(4, 72):
        return 3;
    default:
        return 4;
    }
}
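What such a match does can be spelled out by hand in plain C. The sketch below is an interpretation, not compiler output: each case becomes a test on the first tuple component, y is bound to the second component (b - 1), and the guard after && is an extra condition. The second guard is taken to call f on the sum of the first argument and y, and f here is a hypothetical stand-in that returns its argument.

```c
/* Hypothetical stand-in for the f used in the second guard. */
int f(int n) {
    return n;
}

/* Hand-written equivalent of matching on the pair (a, b - 1). */
int g(int a, int b) {
    int y = b - 1;  /* second tuple component, bound as y */
    if (a == 0 && y > 1) {
        return 1;   /* $(0, y) with guard y > 1 */
    }
    if (a == 3 && f(a + y) == 7) {
        return 2;   /* $(3, y) with guard on f */
    }
    if (a == 4 && y == 72) {
        return 3;   /* $(4, 72): both components are constants */
    }
    return 4;       /* default */
}
```

Cases are tried in order, so a tuple that fails a guard falls through to the later cases and finally to the default.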
A let declaration matches a pattern against an expression and binds the variables in the pattern.
typedef struct Pair {
    int x;
    int y;
} Pair;
void f(Pair p) {
    let Pair(first, second) = p;
    // equivalent to:
    // int first = p.x;
    // int second = p.y;
    // ...
}
Type inference
Rather than auto as in C and C++, or var as in Java and C#, Cyclone uses _ (an underscore) to denote a type-inferred variable.
_ x = (SomeType*)malloc(sizeof(SomeType));
// instead of:
SomeType* x = (SomeType*)malloc(sizeof(SomeType));
_ myNumber = 100; // inferred to int
Exceptions
Cyclone features exceptions. An uncaught exception will halt the program. Like Java, Cyclone features a null pointer exception, called Null_Exception.
typedef FILE File;
File* f = fopen("/etc/passwd", "r");
try {
    int code = getc((File* @notnull)f);
} catch {
case &Null_Exception:
    printf("Error: can't open /etc/passwd\n");
    return 1;
case &Invalid_argument(s):
    printf("Error: invalid argument: %s\n", s);
    return 1;
}
One can also manually throw exceptions:
throw new Null_Exception("This is a null exception");
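For comparison, the C idiom that exceptions stand in for can be sketched with setjmp and longjmp. This is standard C rather than Cyclone, and the names handler, ERR_NULL_FILE, read_first_byte and try_read are hypothetical. The switch on setjmp plays the role of try and catch, and longjmp plays the role of throw.

```c
#include <setjmp.h>
#include <stdio.h>

enum { ERR_NULL_FILE = 1 };  /* hypothetical error code */

static jmp_buf handler;      /* where control resumes after a "throw" */

/* "Throws" by jumping back to the handler when the file is missing. */
int read_first_byte(FILE *f) {
    if (f == NULL) {
        longjmp(handler, ERR_NULL_FILE);
    }
    return getc(f);
}

/* Plays the role of try/catch: returns 0 on success, else the error code. */
int try_read(FILE *f, int *out) {
    switch (setjmp(handler)) {
    case 0:                      /* normal path, like the try block */
        *out = read_first_byte(f);
        return 0;
    case ERR_NULL_FILE:          /* like catching a null-pointer error */
        return ERR_NULL_FILE;
    default:                     /* any other jump value */
        return -1;
    }
}
```

Nothing in C checks that the jump target is still live when longjmp runs, so the pattern relies on discipline rather than on the compiler.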